Abstract

Consider an energy-harvesting receiver that uses the same received signal both for decoding
information and for harvesting energy, which is employed to power its circuitry. In the scenario where the
receiver has limited battery size, a signal with bursty energy content may cause power outage at the 
receiver since the battery will drain during intervals with low signal energy. In this paper, we consider 
a discrete memoryless channel and characterize achievable information rates when the energy content 
in each codeword is regularized by ensuring that sufficient energy is carried within every subblock 
duration. In particular, we study constant subblock-composition codes (CSCCs) where all subblocks in 
every codeword have the same fixed composition, and this subblock-composition is chosen to maximize 
the rate of information transfer while meeting the energy requirement. Compared to constant composition 
codes (CCCs), we show that CSCCs incur a rate loss and that the error exponent for CSCCs is also 
related to the error exponent for CCCs by the same rate loss term. We show that CSCC capacity 
can be improved by allowing different subblocks to have different composition while still meeting the 
subblock energy constraint. We provide numerical examples highlighting the tradeoff between delivery 
of sufficient energy to the receiver and achieving high information transfer rates. It is observed that the 
ability to use energy in real-time imposes less of a penalty than the ability to use information in real-time.

I. Introduction

Although wireless charging of portable electronic devices [1] and implantable biomedical
devices [2] has attracted the attention of researchers over the last few years, pioneering work
on wireless power transfer was conducted over a century ago by Hertz and Tesla [3]–[5]. Similarly,
wireless information transfer has a rich history, including works by Popov [6], Bose [7], and
Marconi [8]. In fact, Marconi’s wireless telegraph device, capable of transatlantic radio
communication, helped save over 700 lives during the tragic accident of the Titanic in 1912 [9]. However,
the first work in an information-theoretic setting on analyzing fundamental tradeoffs between
simultaneous information and energy transfer is relatively recent [10]. The study of simultaneous
information and energy transfer is relevant for communication from a powered transmitter to an 
energy-harvesting receiver which uses the same received signal both for decoding information 
and for harvesting energy. The energy harvested by the receiver is employed to power its circuitry. 

The tradeoff between reliable communication and delivery of energy at the receiver was
characterized in [10] using a general capacity-power function, where transmitted codewords
were constrained to have average received energy exceed a threshold. This tradeoff between
capacity and energy delivery was extended for frequency-selective channels in [11]. Since then,
there have been numerous extensions of the capacity-power function in various settings [12]–[15].
Biomedical applications of wireless energy and information transfer have been proposed
through the use of implanted brain-machine interfaces that receive data and energy through
inductive coupling [2], [16], [17].

However, in practical applications such as biomedical implants, imposing only an average power
constraint is not sufficient; we also need to regularize the transferred energy content. This
is because a codeword satisfying the average power constraint may still cause outage at the
receiver if the energy content in the codeword is bursty, since the receiver's energy buffer, with
a relatively small storage capacity, may drain during intervals with low signal energy. In order
to regularize the energy content in the signal, we herein adopt a subblock-constrained approach 
where codewords are divided into smaller subblocks, and every subblock is constrained to carry 
sufficient energy exceeding a given threshold. The subblock length and the energy threshold may 
be chosen to meet the real-time energy requirement at the receiver. 

An alternative to the subblock-constraint is the sliding-window constraint, which we do not
consider here. Under a sliding-window constraint, each codeword provides sufficient energy
within a sliding time window of certain duration. This approach was adopted in [18], [19],
where the use of runlength codes for simultaneous energy and information transfer was
proposed. In [20], a sliding-window constraint was imposed on binary codewords and bounds on
the capacity were presented for different binary-input channels. Note that the sliding-window
constraint is relatively tighter than the subblock-constraint, since the subblock-constraint corresponds
to the case where the windows are non-overlapping.

In this paper, we consider a discrete memoryless channel (DMC) and characterize achievable 
information rates when each subblock is constrained to carry sufficient energy. We assume that 
corresponding to transmission of each symbol in the input alphabet, the receiver harvests a 
certain amount of energy as a function of the transmitted symbol. Since different symbols 
may correspond to different energy levels, the requirement of sufficient energy content within 
a subblock imposes a constraint on the composition of each subblock. Towards meeting this 
subblock energy requirement, we introduce the constant subblock-composition codes (CSCCs) 
where all the subblocks in every codeword have the same fixed composition. This
subblock-composition, quantifying the fraction of different symbols within each subblock, is chosen to
maximize the rate of information transfer while meeting the energy requirement. Note that if
$x_1^L$ denotes a given subblock of length $L$, then the composition of $x_1^L$ is the distribution
$P_{x_1^L}$ on $\mathcal{X}$ defined by $P_{x_1^L}(x) = N(x)/L$, $x \in \mathcal{X}$, where $N(x)$ is the
number of occurrences of symbol $x$ in subblock $x_1^L$.
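
As a minimal sketch of this definition, the composition of a subblock can be obtained by counting symbols. The function name `composition` and the string encoding of subblocks are illustrative assumptions and do not come from the paper.

```python
from collections import Counter
from fractions import Fraction


def composition(subblock, alphabet):
    """Return the composition of `subblock` as {symbol: N(x)/L}.

    `alphabet` lists every input symbol, so symbols that never occur
    in the subblock are reported with fraction 0.
    """
    L = len(subblock)
    if L == 0:
        raise ValueError("subblock must be non-empty")
    counts = Counter(subblock)  # N(x) for every symbol that occurs
    return {x: Fraction(counts[x], L) for x in alphabet}


# Hypothetical binary subblock of length L = 4.
P = composition("0110", alphabet="01")
print(P)
```

Exact fractions are used so that two subblocks can be compared for equal composition without floating-point tolerance.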

A. Our Contribution 

For meeting the real-time energy requirement at a receiver which uses the received signal to
simultaneously harvest energy and decode information, we propose the use of CSCCs (Sec. III-A)
and establish their capacity as a function of the required energy per symbol (Sec. III-B).
We show that CSCC capacity can be computed efficiently by exploiting certain symmetry
properties (Sec. III-C) and present bounds on subblock length for avoiding receiver energy
outage (Sec. III-D).

Compared to constant composition codes, we quantify the rate loss incurred due to the
additional constraint of restricting all subblocks within codewords to have the same composition
(Sec. IV-A). For a given rate of information transfer, we derive a lower bound for the error
exponent using CSCC in terms of the error exponent for constant composition codes (Sec. IV-B).
We show that information rates greater than CSCC capacity can be achieved by allowing
different subblocks to have different composition, while still meeting the energy requirement per
subblock (Sec. V).

For enabling real-time information transfer, we consider local subblock decoding where each
subblock is decoded independently (Sec. VI), and compare achievable rates using local subblock
decoding with those when all the subblocks within a codeword are jointly decoded. We also
provide numerical results highlighting the tradeoff between delivery of sufficient energy to the
receiver and achieving high information rates (Sec. VII).

B. Related Work 

Codes with different constraints on the codewords have been suggested in the past,
depending on the constraints at the transmitter, the properties of the communication channel, or the
properties of the storage medium. For digital information storage on a magnetic medium [21],
codewords are usually designed to meet the runlength constraint [22] or are optimized for partial
response equalization with maximum-likelihood sequence detection (PRML) [23]. The study of
information capacity using runlength-limited (RLL) codes on binary symmetric channels (BSC)
was carried out in [24]–[26].

A class of binary block codes called multiply constant-weight codes (MCWC), where each
codeword of length $mn$ is partitioned into $m$ equal parts and has weight $w$ in each part, was
explored in [27] owing to their potential application in implementation of low-cost authentication
methods [28]. Note that MCWC, introduced in [27] as a generalization of constant weight codes
[29], are themselves a special case of CSCCs with input alphabet size equal to two. When
each codeword in an MCWC is arranged as an $m \times n$ array and additional weight constraints
are imposed on all the columns, the resulting two-dimensional weight-constrained codes have
potential application in optical storage systems [30] and in power line communications [31].

Power line communications (PLC) requires the power output to be as constant as possible
so that information transfer does not interfere with the primary function of power delivery.
One way to achieve this on the PLC channel (which suffers from narrow-band interference,
white Gaussian noise, and impulse noise [32]) is to employ permutation codes [33], where
each codeword of length $n$ is a permutation of $n$ different frequencies, with each frequency
viewed as an input symbol. Higher rates of information transfer may be achieved using constant
composition codes [34] at the cost of local variation in power, while ensuring that the power
expended is the same upon completion of each codeword. When the codeword length is a multiple
of the frequency alphabet size, the composition may be chosen such that each frequency occurs
an equal number of times in each codeword [35].

Fig. 1. Simultaneous information and energy transfer from a transmitter to an energy-harvesting receiver

The codewords employed by an energy harvesting transmitter are constrained by the
instantaneous energy available for transmission. The capacity of these constrained codes over an additive
white Gaussian noise (AWGN) channel has been analyzed when the energy storage capability
at the transmitter is zero [36], infinite [37], or some finite quantity [38], [39]. The capacity of
an AWGN channel with processing cost at an energy harvesting transmitter was characterized
in [40]. The DMC capacity using an energy harvesting transmitter equipped with a finite energy
buffer was analyzed in [41]. A comprehensive summary of the recent contributions in the broad
area of energy harvesting wireless communications was provided in [42].

II. System Model 

Consider communication from a transmitter to a receiver where the receiver uses the received
signal both for decoding information as well as for harvesting energy (see Fig. 1). We model the
effective communication channel from the output of a digital modulator at the transmitter to the
input to an information decoder at the receiver as a DMC. Note that a DMC is characterized by
input alphabet $\mathcal{X}$, output alphabet $\mathcal{Y}$, and a stochastic matrix
$W : \mathcal{X} \to \mathcal{Y}$ with $W = \{W(y|x) : x \in \mathcal{X}, y \in \mathcal{Y}\}$, where the
matrix entry $W(y|x)$ is the probability that the output is $y$ when the channel input is $x$.

A DMC is a reasonable communication channel model for simultaneous energy and
information transfer. Consider, for instance, the use of a digital modulator at the transmitter which
produces symbols from a signal constellation $\mathcal{X} = \{x_1, \dots, x_r\}$. At the receiver, the signal is split
for use by the energy harvesting module and the information processing module, respectively.
The input to the information decoder at the receiver comprises one of $s$ quantized values
$\mathcal{Y} = \{y_1, \dots, y_s\}$, fed by a quantizer in the information processing path. For each quantized
value $y_i$, $1 \le i \le s$, and each transmitted symbol $x_j$, $1 \le j \le r$, the likelihood $\Pr(y_i|x_j)$ can be
computed based on the effective signal path from the transmit modulator to the quantizer at the
receiver. The communication channel is thus a DMC with input alphabet $\mathcal{X}$, output alphabet $\mathcal{Y}$,
and channel transition probabilities $\Pr(y_i|x_j)$.

In practice, the effective channels seen by the information decoder and the energy harvester
may be different due to their respective pre-processing stages. A simple time-sharing approach
to transmitting energy and information simultaneously was suggested in [43] via interleaving of
the energy signal and the information-bearing signal. In [44], practical architectures for simultaneous
information and energy reception were defined: an “integrated” receiver architecture has shared
radio frequency chains between the energy harvester and the information decoder, whereas a
“separated” architecture has different chains.

In our work, we assume a generic receiver architecture where the received signal is split 
between the energy harvesting path and the information processing path with a static power 
splitting ratio. The effective communication channel seen by the decoder in the information 
processing path is modeled as a DMC. We let $b(x)$ denote the energy harvested by the harvester
after the signal split at the receiver, when $x \in \mathcal{X}$ is transmitted. Thus, $b$ is a map from the input
alphabet $\mathcal{X}$ to the set of non-negative real numbers, and higher energy is carried by symbols
having a higher $b$-value. This map is assumed to be time-invariant, and reflects the scenario where
the statistical nature of the effective communication channel is due to the noise in the receiver 
circuitry, which does not affect the harvested energy. The quantification of b abstracts the specific 
implementation of a chosen receiver architecture, which in turn helps to abstract the problem of 
the code design for simultaneous energy and information transfer from implementation details. 

In order to meet the real-time energy requirement at the receiver, we partition the
transmitted codeword into equal-sized subblocks (see Fig. 2) and require that transmitted symbols be
chosen such that the expected harvested energy in each subblock exceeds a given threshold.
This threshold is a function of the energy consumption by the receiver circuitry including the
information decoder. We will denote the subblock length by $L$ and assume that the codeword
length, denoted $n$, is a multiple of $L$. If a transmitted codeword is denoted $(X_1, X_2, \dots, X_n)$,
then the constraint on sufficient energy within each subblock can be expressed as

$$\frac{1}{L} \sum_{i=1}^{L} b\bigl(X_{(j-1)L+i}\bigr) \ge B, \qquad j = 1, \dots, m, \tag{1}$$

where $j$ is the subblock index, $B$ denotes the required energy per symbol at the receiver, and $m = n/L$
is the number of subblocks in a codeword. The choice of the subblock length $L$ depends on the
energy storage capacity at the receiver; a small energy buffer generally requires a relatively small
value of $L$ to prevent energy outage at the receiver.

Fig. 2. Transmitted codeword partitioned into subblocks of length L.

The subblock energy constraint given by (1) becomes trivial if $b(x)$ is the same for all $x \in \mathcal{X}$ (for
instance, when the transmitted symbols belong to a phase-shift-keying constellation). However,
the constraint is non-trivial when the $b$-values are not constant (for instance, using on-off keying)
and the threshold $B$ satisfies

$$b_{\min} < B < b_{\max}, \tag{2}$$

where

$$b_{\min} = \min_{x \in \mathcal{X}} b(x), \qquad b_{\max} = \max_{x \in \mathcal{X}} b(x). \tag{3}$$

In the rest of the paper we assume (2) is satisfied, unless otherwise stated.

For a given subblock $j$ within a codeword, if $N(x)$ denotes the number of occurrences of
symbol $x$ in the $j$th subblock, then (1) can alternately be expressed as

$$\sum_{x \in \mathcal{X}} \frac{N(x)}{L}\, b(x) \ge B. \tag{4}$$

Note that $N(x)/L$ denotes the fraction of time when symbol $x$ appears in the subblock. We
now introduce constant subblock-composition codes, which are a natural way to meet the subblock
energy constraint.
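
The difference between a per-codeword and a per-subblock energy requirement can be illustrated with a short sketch. The helper names and the on-off keying energy map below are assumptions chosen for illustration; they are not taken from the paper.

```python
def subblock_energies(codeword, L, b):
    """Average harvested energy per symbol in each length-L subblock."""
    if L <= 0 or len(codeword) % L != 0:
        raise ValueError("codeword length must be a positive multiple of L")
    return [
        sum(b[x] for x in codeword[start:start + L]) / L
        for start in range(0, len(codeword), L)
    ]


def meets_subblock_constraint(codeword, L, b, B):
    """True if every subblock delivers at least B units of energy per symbol."""
    return all(energy >= B for energy in subblock_energies(codeword, L, b))


# Assumed on-off keying energy map: symbol 1 carries one unit, symbol 0 none.
b = {0: 0.0, 1: 1.0}
smooth = [1, 0, 1, 0, 0, 1, 1, 0]   # every subblock of length 4 has two 1s
bursty = [1, 1, 1, 1, 0, 0, 0, 0]   # same average energy, all in one subblock

print(meets_subblock_constraint(smooth, 4, b, 0.5))
print(meets_subblock_constraint(bursty, 4, b, 0.5))
```

Both example codewords have the same average energy per symbol over the whole codeword, yet only the first one satisfies the subblock constraint with $L = 4$ and $B = 0.5$.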


III. Constant Subblock-Composition Codes 
A. Motivation and Definition 

We have seen that for a given subblock, the energy constraint given by (1) can equivalently
be expressed as (4), and this constraint is satisfied provided the fraction of time each symbol
appears in the subblock is chosen appropriately. This observation motivates the use of codes
where the composition of each subblock in all codewords is constant and is chosen such that
(4) is satisfied. A constant subblock-composition code (CSCC) is one in which all codewords
are partitioned into equal-sized subblocks and each subblock (in all codewords) has the same
type $P$. The subblock type $P$ in a CSCC is chosen to satisfy the subblock energy constraint

$$\mathbb{E}_P[b(X)] = \sum_{x \in \mathcal{X}} P(x)\, b(x) \ge B. \tag{5}$$


B. Capacity using CSCC 


Let $\mathcal{P}_L$ denote the set of all compositions for input sequences of length $L$. For a given type
$P \in \mathcal{P}_L$, the set of sequences in $\mathcal{X}^L$ with composition $P$ is denoted by $T_P^L$ and is called the
type class or composition class of $P$. In a CSCC with subblock-composition $P$, every subblock
in a codeword may be viewed as an element of $T_P^L$.

In order to compute the capacity of a CSCC on a DMC, we may view the $L$ uses of the
original channel as a single use of the induced vector channel having input alphabet $T_P^L$ and
output alphabet $\mathcal{Y}^L$. Since the underlying channel is memoryless, the transition probability for
a pair of input and output vectors is the product of the corresponding transition probabilities of
the underlying channel. If we let $x_1^L = x_1 \dots x_L$ and $y_1^L = y_1 \dots y_L$ be given input and output
vectors with $x_i \in \mathcal{X}$ and $y_i \in \mathcal{Y}$, respectively, then the transition probabilities for the induced
vector channel are:

$$W^L(y_1^L | x_1^L) = \prod_{i=1}^{L} W(y_i | x_i). \tag{6}$$

Since each subblock in a codeword may be chosen independently, the capacity using CSCC
with subblock-composition $P$, denoted $C^L_{\mathrm{CSCC}}(P)$, is equal to $1/L$ times the capacity of the
induced vector channel with input alphabet $T_P^L$, output alphabet $\mathcal{Y}^L$, and transition probabilities
given by (6). Thus if we denote $X_1^L = X_1 \dots X_L$ and $Y_1^L = Y_1 \dots Y_L$, then


\begin{align}
C^L_{\mathrm{CSCC}}(P) &= \max \frac{I(X_1^L; Y_1^L)}{L} \tag{7} \\
&= \frac{1}{L} \max \left[ H(Y_1^L) - H(Y_1^L | X_1^L) \right] \tag{8} \\
&= \frac{1}{L} \max \left[ H(Y_1^L) - \sum_{i=1}^{L} H(Y_i | X_i) \right], \tag{9}
\end{align}

where the last equality follows from the memoryless property of the channel. The maximization
in (7) is over the distribution of the input vector in $T_P^L$. We will show that the maximum is achieved
when the input vectors $X_1^L$ are uniformly distributed over $T_P^L$.


Theorem 1. The capacity of the induced vector channel using CSCC with fixed
subblock-composition $P$ is obtained via a uniform distribution of the input vectors in $T_P^L$.

Proof: See Appendix A. ■

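If we define the set of distributions

$$\Gamma_B^L \triangleq \{P \in \mathcal{P}_L : \mathbb{E}_P[b(X)] \ge B\}, \tag{10}$$

then the capacity using CSCC with subblock energy constraint (5), denoted $C^L_{\mathrm{CSCC}}(B)$, is defined
as

$$C^L_{\mathrm{CSCC}}(B) \triangleq \max_{P \in \Gamma_B^L} C^L_{\mathrm{CSCC}}(P). \tag{11}$$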


C. Computing CSCC Capacity 


By Theorem 1, the maximum in (9) is achieved when $X_1^L$ is uniformly distributed over $T_P^L$.
The computation of the capacity expression with increasing subblock length $L$ seems challenging
since the input and output alphabet sizes for the induced vector channel grow exponentially with
$L$. However, we will show that the computational complexity of the CSCC capacity expression
can be reduced using the following observations.

First note that the probability distribution for the output vector in the induced vector channel
is given by

$$P_{Y_1^L}(y_1^L) = \frac{1}{|T_P^L|} \sum_{x_1^L \in T_P^L} W^L(y_1^L | x_1^L), \tag{12}$$

since the input vectors are uniformly distributed over $T_P^L$. If $\tilde{y}_1^L$ is another output vector having
the same composition as $y_1^L$, then we have $P_{Y_1^L}(\tilde{y}_1^L) = P_{Y_1^L}(y_1^L)$. This is because the columns
$W^L(\tilde{y}_1^L|\cdot)$ and $W^L(y_1^L|\cdot)$ of the vector channel transition matrix are permutations of each other
(see Appendix A). Thus output vectors having the same composition have equal probability.
However, even though the input vectors are uniformly distributed, the output vectors in general
are not uniformly distributed. Also, since the symbols within an input vector $x_1^L \in T_P^L$ are not
independent, in general we have $P_{Y_1^L}(y_1^L) \ne \prod_{i=1}^{L} P_Y(y_i)$, where $P_Y(y)$ denotes the probability
of output scalar symbol $y$.


June 2, 2015 


DRAFT 





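Let $\mathcal{Q}_L$ denote the set of all compositions for output sequences of length $L$. When $X_1^L$ is
uniformly distributed over $T_P^L$, the $H(Y_1^L)$ term in (9) can be expressed as

\begin{align}
H(Y_1^L) &= -\sum_{y_1^L \in \mathcal{Y}^L} P_{Y_1^L}(y_1^L) \log P_{Y_1^L}(y_1^L) \tag{13} \\
&= -\sum_{Q \in \mathcal{Q}_L} \sum_{y_1^L \in T_Q^L} P_{Y_1^L}(y_1^L) \log P_{Y_1^L}(y_1^L) \tag{14} \\
&= \sum_{Q \in \mathcal{Q}_L} |T_Q^L|\, P_{Y_1^L}(y_1^L) \log \frac{1}{P_{Y_1^L}(y_1^L)}, \tag{15}
\end{align}

where the last equality follows because $P_{Y_1^L}(y_1^L)$ is the same for all $y_1^L \in T_Q^L$. Note that we choose
only one representative vector $y_1^L$ from each type class $T_Q^L$ in the last equality.

Secondly, the following proposition shows that the $H(Y_i|X_i)$ term in (9) is the same for all
$1 \le i \le L$, since the corresponding joint probabilities $P_{XY}(X_i = x, Y_i = y)$ are equal.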


Proposition 1. For a random input vector $X_1^L$ uniformly distributed over $T_P^L$ with corresponding
output vector $Y_1^L$, the pairwise probability $P_{XY}(X_i = x, Y_i = y)$, for $1 \le i \le L$, satisfies

$$P_{XY}(X_i = x, Y_i = y) = \frac{N(x)}{L} W(y|x) = P(x) W(y|x). \tag{16}$$


Proof: Since

$$P_{XY}(X_i = x, Y_i = y) = \Pr(X_i = x)\, W(y|x), \tag{17}$$

the claim will be proved if we show $\Pr(X_i = x) = N(x)/L$ for all $1 \le i \le L$. As $X_1^L$ is
uniformly distributed over $T_P^L$, $\Pr(X_i = x)$ is equal to the ratio of the number of input
vectors with $x$ at index $i$ to the total number of vectors in $T_P^L$. Since

$$|T_P^L| = \frac{L!}{\prod_{\tilde{x} \in \mathcal{X}} N(\tilde{x})!} \tag{18}$$

and the number of sequences in $T_P^L$ with $x$ at index $i$ is

$$\frac{(L-1)!}{(N(x) - 1)! \prod_{\tilde{x} \in \mathcal{X},\, \tilde{x} \ne x} N(\tilde{x})!}, \tag{19}$$

the ratio of the quantities given by (19) and (18) is equal to $\Pr(X_i = x) = N(x)/L$. ■
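
The counting argument can be checked by brute-force enumeration for a small type class. The following is only a sanity-check sketch; the function name `marginal_at` and the example composition are hypothetical.

```python
import itertools
from fractions import Fraction


def marginal_at(counts, index):
    """Distribution of X_index when X_1^L is uniform over the type class.

    counts[x] is N(x), the number of times symbol x appears in each sequence.
    """
    base = [x for x, n in enumerate(counts) for _ in range(n)]
    type_class = set(itertools.permutations(base))  # distinct sequences only
    return {
        x: Fraction(sum(1 for seq in type_class if seq[index] == x), len(type_class))
        for x in range(len(counts))
    }


counts = [1, 2, 3]  # hypothetical composition with L = 6
L = sum(counts)
expected = {x: Fraction(n, L) for x, n in enumerate(counts)}
print(all(marginal_at(counts, i) == expected for i in range(L)))
```

Enumeration is exponential in $L$, so this is useful only as a check on short subblocks.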
The next proposition gives a computationally efficient expression for CSCC capacity.

Proposition 2. The CSCC capacity, $C^L_{\mathrm{CSCC}}(B)$, is given by

$$C^L_{\mathrm{CSCC}}(B) = \max_{P \in \Gamma_B^L} \left[ \frac{1}{L} \sum_{Q \in \mathcal{Q}_L} |T_Q^L|\, P_{Y_1^L}(y_1^L) \log \frac{1}{P_{Y_1^L}(y_1^L)} - H(Y|X) \right], \tag{20}$$

where only one representative output vector $y_1^L$ is chosen from every type class $T_Q^L$, $P_{Y_1^L}(y_1^L)$ is
given by (12), and $H(Y|X)$ is evaluated using the joint pairwise probability distribution given
by (16).

Proof: Use (9) and (11) to express $C^L_{\mathrm{CSCC}}(B)$. From Thm. 1, a uniform distribution over
$T_P^L$ achieves capacity, and hence the entropy term $H(Y_1^L)$ in (9) can be computed using (15).
The claim in Prop. 2 follows by further noting that the $H(Y_i|X_i)$ term in (9) is the same for all
$1 \le i \le L$, and can be evaluated using the joint pairwise distribution in (16). ■
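
A minimal sketch of the computation in Prop. 2 for one fixed subblock-composition is given below. It visits a single representative output vector per output type class, as in (15), and uses $P(x) = N(x)/L$ for the $H(Y|X)$ term, as in (16). The function names, the row-list encoding `W[x][y]` of the channel, and the use of bits as the unit are assumptions made for illustration; the outer maximization over compositions in $\Gamma_B^L$ is not performed here.

```python
import itertools
import math


def type_class(counts):
    """All distinct input sequences containing counts[x] copies of symbol x."""
    base = [x for x, n in enumerate(counts) for _ in range(n)]
    return sorted(set(itertools.permutations(base)))


def multinomial(counts):
    """Number of sequences with the given symbol counts."""
    size = math.factorial(sum(counts))
    for n in counts:
        size //= math.factorial(n)
    return size


def cscc_rate(counts, W):
    """Rate (bits per channel use) for subblock counts N(x) and channel W[x][y]."""
    L = sum(counts)
    num_outputs = len(W[0])
    inputs = type_class(counts)
    # H(Y_1^L): one sorted representative per output composition, weighted
    # by the size of its type class.
    H_out = 0.0
    for rep in itertools.combinations_with_replacement(range(num_outputs), L):
        p = sum(
            math.prod(W[x][y] for x, y in zip(xs, rep)) for xs in inputs
        ) / len(inputs)
        if p > 0:
            size = multinomial([rep.count(y) for y in range(num_outputs)])
            H_out -= size * p * math.log2(p)
    # H(Y|X) with the single-letter input distribution P(x) = N(x)/L.
    H_cond = -sum(
        (n / L) * w * math.log2(w)
        for n, row in zip(counts, W)
        for w in row
        if w > 0
    )
    return H_out / L - H_cond


# Hypothetical example: BSC with crossover 0.1, subblocks with two 0s and two 1s.
bsc = [[0.9, 0.1], [0.1, 0.9]]
print(cscc_rate([2, 2], bsc))
```

The output side is reduced from $|\mathcal{Y}|^L$ vectors to one per output composition, but the input type class is still enumerated explicitly, so this sketch is practical only for short subblocks.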

D. Choice of Subblock Length L 

In this subsection, we derive bounds on subblock length L (as a function of the energy storage 
capacity at the receiver) which will ensure that the receiver never runs out of energy when the 
subblock-composition $P$ is chosen to satisfy (5). It will be seen that a large energy storage
capacity allows for larger values of L and hence results in higher rates of information transfer. 

The energy storage capacity at the receiver is denoted $E_{\max}$ and we assume that the receiver
requires $B$ units of energy per received symbol for its processing. Let $E(i)$ denote the level
of the energy buffer at the receiver at the completion of $i - 1$ uses of the channel. The energy
update equation, for $i = 1, 2, \dots$, is given by

$$E(i+1) = \min\left( E_{\max},\, [E(i) + b(X_i) - B]^+ \right), \tag{21}$$

where $X_i$ is the symbol transmitted in the $i$th channel use, and $[z]^+ = \max(z, 0)$.
We say that an outage occurs during the $i$th channel use if

$$E(i) + b(X_i) < B, \tag{22}$$

while an overflow event occurs if

$$E(i) + b(X_i) - B > E_{\max}. \tag{23}$$

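We partition the input alphabet as $\mathcal{X} = \mathcal{X}_{<} \cup \mathcal{X}_{\ge}$, where

\begin{align}
\mathcal{X}_{<} &= \{x \in \mathcal{X} \mid b(x) < B\}, \tag{24} \\
\mathcal{X}_{\ge} &= \{x \in \mathcal{X} \mid b(x) \ge B\}. \tag{25}
\end{align}

For CSCC with subblock-composition $P \in \Gamma_B^L$ we define

$$G \triangleq L \sum_{x \in \mathcal{X}_{<}} P(x) \bigl( B - b(x) \bigr), \tag{26}$$

where $G$ will be used to characterize some useful properties of the energy update process.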

Lemma 1. The energy update process satisfies the following properties for CSCC with
subblock-composition $P \in \Gamma_B^L$:

(a) If there is no energy outage or overflow during the reception of the first subblock, then
$E(L+1) \ge E(1)$.

(b) If $E(1) \ge G$, then there is no energy outage during the reception of the first subblock.

(c) If $E(1) \ge G$ and $E_{\max} \ge 2G$, then $E(L+1) \ge G$.

Proof: If there is no energy outage or overflow, then the total energy harvested during the
reception of the first subblock is $L \sum_{x \in \mathcal{X}} P(x) b(x)$, while the total energy consumed is $LB$, and
claim (a) follows since $P$ satisfies (5).

Let $X_i$ denote the transmitted symbol in the $i$th channel use, $I = \{1, 2, \dots, L\}$, and
$I_{<} = \{i \in I \mid X_i \in \mathcal{X}_{<}\}$. For $i \in I$, the level in the energy buffer decreases during the $i$th channel
use if and only if $i \in I_{<}$, and the corresponding decrease in energy level is $B - b(X_i)$. Since
the subblock has composition $P$, the sum of energy decrements over the reception of the first
subblock is $\sum_{i \in I_{<}} \bigl( B - b(X_i) \bigr) = G$, and claim (b) follows.

For proving claim (c), we note that the condition $E(1) \ge G$ implies that there is no energy
outage during the reception of the first subblock (using claim (b)). Further, if there is no overflow
then $E(L+1) \ge E(1) \ge G$ (using claim (a)). In case there is energy overflow in the $i$th channel
use for any $i \in I$, we have $E(i+1) = E_{\max} \ge 2G$, and thus $E(L+1) \ge E(i+1) - G \ge G$. ■

Lemma 1 is useful in proving the following theorem, which gives a necessary and sufficient
condition on subblock length in order to avoid outage.


Theorem 2. A necessary and sufficient condition on $L$ for avoiding energy outage during the
reception of CSCC codewords, with subblock-composition $P$ satisfying (5), is

$$L \le \frac{E_{\max}}{2 \sum_{x \in \mathcal{X}_{<}} P(x) \bigl( B - b(x) \bigr)}, \tag{27}$$

with $E(1) \ge G$.

Proof: See Appendix B. ■

The initial condition on energy level, $E(1) \ge G$, may be ensured by transmitting a preamble,
consisting of symbols with high energy content, before the transmission of codewords. This
preamble has bounded length and hence does not affect the channel capacity.
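
The role of the buffer size can be illustrated by simulating the energy update recursion (21) together with the outage test (22). In the sketch below, `deficit_G` computes $G$ as the total energy decrement contributed within one subblock by symbols with $b(x) < B$, which is how the proof of Lemma 1 uses it; this reading, the function names, and the on-off keying example are assumptions made for illustration.

```python
def simulate_buffer(symbols, b, B, E_max, E_start):
    """Run the update (21) over `symbols`; return (final level, outage seen)."""
    level = E_start
    outage = False
    for x in symbols:
        if level + b[x] < B:          # outage condition (22)
            outage = True
        level = min(E_max, max(level + b[x] - B, 0.0))
    return level, outage


def deficit_G(counts, b, B):
    """Assumed reading of G: sum of N(x) * (B - b(x)) over symbols with b(x) < B."""
    return sum(n * (B - b[x]) for x, n in counts.items() if b[x] < B)


# Assumed on-off keying map; each subblock of length 4 has two 0s and two 1s.
b = {0: 0.0, 1: 1.0}
B = 0.5
G = deficit_G({0: 2, 1: 2}, b, B)

# Worst-case ordering: high-energy symbols first, then two runs of zeros back to back.
codeword = [1, 1, 0, 0, 0, 0, 1, 1]
print(simulate_buffer(codeword, b, B, E_max=2 * G, E_start=G))
print(simulate_buffer(codeword, b, B, E_max=1.5 * G, E_start=G))
```

With $E_{\max} = 2G$ the buffer never runs dry in this example, whereas the smaller buffer overflows during the leading high-energy symbols and then suffers an outage in the following run of zeros.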
IV. Comparing CSCC with Constant Composition Codes
A. Rate Comparison

Similar to subblock-composition, a codeword composition represents the fraction of times each
input symbol occurs in a codeword, and a constant composition code (CCC) is one in which all
codewords have the same composition. Note that a CSCC with subblock-composition $P$ may
also be viewed as a CCC with codeword composition $P$, since all the subblocks in a CSCC have
the same composition. In general for a CCC, although all codewords have the same composition,
different subblocks within a codeword may have different compositions. Hence CCCs are richer
than CSCCs in terms of choice of symbols within each subblock. CCCs were first analyzed by
Fano [45] and shown to be sufficient to achieve capacity for any discrete memoryless channel.

Let $C_{\mathrm{CCC}}(P)$ denote the maximum achievable rate using CCC with codeword composition
$P$. For $P \in \Gamma_B^L$ (refer to (10)), a CCC with codeword composition $P$ will ensure that the average
received energy per symbol in a codeword is at least $B$. However, it may violate the constraint
on providing sufficient energy to the receiver within every subblock duration. For a CCC, we
have [45]

$$C_{\mathrm{CCC}}(P) = I(X; Y) = H(X) - H(X|Y). \tag{28}$$


We are interested in quantifying the information rate penalty incurred by using CSCC
compared to CCC, given by $C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P)$. This information rate penalty is the price we pay
for meeting the real-time energy requirement within every subblock duration, compared to the
less constrained energy requirement per codeword. Although the rate penalty can be numerically
computed by explicit computation of $C_{\mathrm{CCC}}(P)$ and $C^L_{\mathrm{CSCC}}(P)$, the numerical approach has the
limitation that the computational complexity of $C^L_{\mathrm{CSCC}}(P)$ increases with an increase in subblock
length $L$.

In a CSCC, since a transmitted subblock $X_1^L$ is uniformly distributed over $T_P^L$, we have
[46, p. 26]

$$H(X_1^L) = \log |T_P^L| = L H(P) - L\, r(L, P), \tag{29}$$

where $r(L, P)$ denotes a function of $L$ and $P$ given as

$$r(L, P) = \frac{s(P) - 1}{2L} \log(2 \pi L) + \frac{1}{2L} \sum_{a : P(a) > 0} \log P(a) + \frac{\vartheta(L, P)\, s(P)}{12 L \ln 2}, \tag{30}$$

with $s(P)$ denoting the number of elements $x \in \mathcal{X}$ with $P(x) > 0$, and $\vartheta(L, P)$ is a real number
between zero and one which is chosen so that (29) is satisfied.
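
Since $r(L, P)$ is fixed by (29), it can be evaluated exactly from the size of the type class without knowing $\vartheta(L, P)$. The sketch below does this in bits; the function names and the example compositions are illustrative assumptions.

```python
import math


def type_class_size(counts):
    """|T_P^L| = L! / prod_x N(x)!, computed with exact integers."""
    size = math.factorial(sum(counts))
    for n in counts:
        size //= math.factorial(n)
    return size


def rate_loss(counts):
    """r(L, P) in bits, obtained by solving (29): r = H(P) - log2|T_P^L| / L."""
    L = sum(counts)
    entropy = -sum((n / L) * math.log2(n / L) for n in counts if n > 0)
    return entropy - math.log2(type_class_size(counts)) / L


print(rate_loss([2, 2]))   # L = 4, P = (1/2, 1/2)
print(rate_loss([1, 3]))   # L = 4, P = (1/4, 3/4)
```

Here `counts[x]` plays the role of $N(x) = L P(x)$, so the subblock length is implied by the sum of the counts.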
We now present simple analytical bounds for this rate penalty. The following theorem shows
that the rate penalty by using CSCC, relative to CCC, is bounded by $r(L, P)$.

Theorem 3. The rate penalty is bounded as

$$0 \le C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) \le r(L, P). \tag{31}$$

Further, there exist channels for which the rate penalty meets the upper or lower bound in (31)
with equality.

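Proof: When $X_1^L$ is uniformly distributed over $T_P^L$,

\begin{align}
C^L_{\mathrm{CSCC}}(P) &= \frac{1}{L} \left[ H(X_1^L) - H(X_1^L | Y_1^L) \right] \tag{32} \\
&\overset{(a)}{=} H(P) - r(L, P) - \frac{1}{L} \sum_{i=1}^{L} H\bigl( X_i \,\big|\, X_1^{i-1}, Y_1^L \bigr) \notag \\
&\overset{(b)}{\ge} H(P) - r(L, P) - \frac{1}{L} \sum_{i=1}^{L} H(X_i | Y_i) \notag \\
&\overset{(c)}{=} H(P) - r(L, P) - H(X|Y) \notag \\
&\overset{(d)}{=} C_{\mathrm{CCC}}(P) - r(L, P), \tag{33}
\end{align}

where $X_1^{i-1}$ denotes $X_1, \dots, X_{i-1}$, (a) follows from (29) and the chain rule for entropy, (b) follows
since conditioning only reduces entropy, (c) follows from (16), and (d) follows from (28). Now,
the upper bound in (31) follows from (33), while the lower bound holds because a CSCC with
subblock-composition $P$ is also a CCC with codeword composition $P$. Explicit channels can be
constructed which meet the bounds in (31).

• $C_{\mathrm{CCC}}(P) = C^L_{\mathrm{CSCC}}(P) = 0$ for a binary symmetric channel (BSC) with crossover
probability equal to 0.5.

• For a noiseless channel, we have $C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) = r(L, P)$ due to equality in (b),
as $H\bigl( X_i \,\big|\, X_1^{i-1}, Y_1^L \bigr) = H(X_i | Y_i) = 0$. ■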


Corollary 1.

$$\lim_{L \to \infty} C^L_{\mathrm{CSCC}}(P) = C_{\mathrm{CCC}}(P). \tag{34}$$

Proof: Note that for a fixed $P$, the value of $r(L, P)$ as a function of $L$ is non-negative and
falls roughly as $\log(L)/L$, and thus tends to zero as $L \to \infty$. Thus (34) follows by taking the
limit $L \to \infty$ in (31). ■
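
The decay of the rate loss can be observed numerically. The sketch below evaluates $r(L, P) = H(P) - \log_2 |T_P^L| / L$ with log-gamma functions, so that large $L$ poses no difficulty, and compares it with the Stirling-type leading terms $\frac{s(P)-1}{2L} \log_2(2 \pi L) + \frac{1}{2L} \sum_{a} \log_2 P(a)$. The function names and the choice of the uniform binary composition are assumptions for illustration.

```python
import math


def rate_loss(counts):
    """r(L, P) in bits via log-gamma: H(P) - log2|T_P^L| / L."""
    L = sum(counts)
    log_size = math.lgamma(L + 1) - sum(math.lgamma(n + 1) for n in counts)
    entropy = -sum((n / L) * math.log2(n / L) for n in counts if n > 0)
    return entropy - log_size / (L * math.log(2))


def leading_terms(counts):
    """Leading terms of r(L, P) in bits (the bounded correction term is omitted)."""
    L = sum(counts)
    support = [n / L for n in counts if n > 0]
    return (
        (len(support) - 1) * math.log2(2 * math.pi * L)
        + sum(math.log2(p) for p in support)
    ) / (2 * L)


for L in (2, 8, 32, 128, 512):
    counts = [L // 2, L // 2]          # composition P = (1/2, 1/2)
    print(L, round(rate_loss(counts), 5), round(leading_terms(counts), 5))
```

For this composition the two columns approach each other as $L$ grows, and both shrink toward zero, in line with the corollary.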
Remark. For a fixed subblock length $L$, the CSCC capacity can be achieved by making the
number of subblocks in a codeword arbitrarily large and performing joint decoding over all the
subblocks. However, when the number of subblocks in a codeword is kept constant and the
subblock length is increased without bound, then achievable rates using CSCC tend to CCC
capacity. In particular, when there is only one subblock in a codeword, the CSCC is the
same as a CCC, whose capacity can be achieved by making $L$ arbitrarily large.

The upper bound (30) on the rate penalty given by $r(L, P)$ is independent of the underlying
channel. In general, given a communication channel, the bounds on rate penalty can be further
improved. Consider, for example, a BSC with crossover probability $p_0$ where $0 < p_0 < 0.5$. For
this channel, the upper bound can be tightened using Thm. 4. We first define a binary operator $\star$
and a function $h$, respectively, as

$$a \star b = a(1 - b) + (1 - a)b, \tag{35}$$

$$h(x) = -x \log x - (1 - x) \log(1 - x). \tag{36}$$

We employ the above definitions to state the following theorem on bounding the rate penalty
for a BSC.

Theorem 4. For a BSC with crossover probability $0 < p_0 < 0.5$, input distribution denoted by
$P(0) = \Pr(X = 0)$, $P(1) = \Pr(X = 1)$, and $0 < \gamma = \min(P(0), P(1)) \le 0.5$ we have

$$0 \le C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) \le h(p_0 \star \gamma) - h(p_0 \star \alpha) \le r(L, P), \tag{37}$$

where $\alpha$ is chosen such that

$$h(\alpha) = h(\gamma) - r(L, P), \qquad 0 < \alpha < 0.5. \tag{38}$$

Proof: See Appendix C. ■
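
To see how much the channel-specific bound tightens the general one, the middle expression in (37) can be evaluated numerically. The sketch below solves (38) for $\alpha$ by bisection, using the fact that $h$ is increasing on $[0, 0.5]$, and obtains $r(L, P)$ exactly from the binomial coefficient via (29). All function names and the example parameters are assumptions for illustration, and logarithms are taken to base 2.

```python
import math


def h(x):
    """Binary entropy function (36), in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def star(a, b):
    """Binary convolution (35)."""
    return a * (1 - b) + (1 - a) * b


def h_inverse(value, tol=1e-12):
    """Solve h(alpha) = value for alpha in [0, 0.5] by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) < value:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def bsc_penalty_bounds(n_ones, L, p0):
    """Return (bound from the middle term of (37), general bound r(L, P))."""
    gamma = min(n_ones, L - n_ones) / L
    r = h(gamma) - math.log2(math.comb(L, n_ones)) / L   # from (29)
    alpha = h_inverse(h(gamma) - r)                       # from (38)
    return h(star(p0, gamma)) - h(star(p0, alpha)), r


# Hypothetical example: L = 4 with two 1s per subblock, crossover 0.1.
print(bsc_penalty_bounds(2, 4, 0.1))
```

As the crossover probability approaches zero the two returned values coincide, which matches the noiseless case where the penalty equals $r(L, P)$.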

The proof of Theorem 4 uses Mrs. Gerber’s Lemma (MGL) [47]. Using an extension [48] of
MGL, the upper bound on the rate penalty can similarly be improved for general memoryless
binary-input symmetric-output channels. In particular, we have the following theorem for the
binary erasure channel (BEC).

Theorem 5. For a BEC with erasure probability $\epsilon > 0$,

$$C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) \le (1 - \epsilon)\, r(L, P) \le r(L, P). \tag{39}$$

Proof: See Appendix D. ■

For memoryless asymmetric binary-input, binary-output channels, an alternate upper bound on
the rate penalty (other than (31)) may be obtained using the equality of the channel characteristic
function and the gerbator [49]. As an example, we have the following theorem for the Z-channel.

Theorem 6. For a Z-channel with $\gamma = \Pr(X = 1)$ and $p_0 = \Pr(1 \to 0)$, we have

$$C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) \le h\bigl( \gamma (1 - p_0) \bigr) - h\bigl( \alpha (1 - p_0) \bigr), \tag{40}$$

where $h(\cdot)$ is given by (36), and $\alpha$ is chosen such that

$$h(\alpha) = h(\gamma) - r(L, P), \qquad 0 < \alpha < 0.5. \tag{41}$$

Proof: See Appendix E. ■

The rate penalty bound given by (40) may sometimes be worse than the bound in (31), 
depending on $\gamma$ and $p_0$. In general, the rate penalty for the Z-channel can be upper bounded by 
$\min\left(r(L, P),\ h(\gamma(1 - p_0)) - h(\alpha(1 - p_0))\right)$. 


B. Error Exponent Comparison 

In this subsection, we discuss the error exponent using CSCC and show that it can be bounded 
as a function of the (computationally simpler) error exponent for CCC. 

We now present some definitions and notation which will be used in this subsection. For a pair 
of random variables $(X, Y)$ with $P_X = P$ and conditional probability distribution $P_{Y|X} = W$, 
we will write $H(Y|X)$ as $H(W|P)$, $I(X;Y)$ as $I(P, W)$, and the distribution of $Y$ as $PW$. 
Thus we have 


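$$PW(y) \triangleq \sum_{x \in \mathcal{X}} P(x) W(y|x), \quad y \in \mathcal{Y}, \tag{42}$$

$$H(W|P) \triangleq \sum_{x \in \mathcal{X}} P(x) H(W(\cdot|x)), \tag{43}$$

$$I(P, W) \triangleq H(PW) - H(W|P). \tag{44}$$

The informational divergence of distributions $P$ and $Q$ is denoted as 

$$D(P\|Q) \triangleq \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}. \tag{45}$$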
The conditional informational divergence of stochastic matrices $V: \mathcal{X} \to \mathcal{Y}$ and $W: \mathcal{X} \to \mathcal{Y}$ 
with respect to distribution $P$ on $\mathcal{X}$ is denoted as 

$$D(V\|W|P) \triangleq \sum_{x \in \mathcal{X}} P(x) D(V(\cdot|x)\,\|\,W(\cdot|x)). \tag{46}$$

For CCCs with codeword composition $P$ and information rate $R > 0$, the sphere packing 
exponent function [46] of DMC $W$ is given by 

$$E_{sp}(R, P, W) \triangleq \min_{V:\, I(P, V) \le R} D(V\|W|P), \tag{47}$$


with $V$ ranging over all channels $V: \mathcal{X} \to \mathcal{Y}$, and represents an upper bound on the error 
exponent using best possible codes. For fixed $P$ and $W$, the function $E_{sp}(R, P, W)$ is a convex 
function of $R \ge 0$ (which follows from convexity of $D(V\|W|P)$ and $I(P, V)$ as a function of 
$V$), positive for $R < I(P, W)$ and zero otherwise. 

The random coding exponent function [46] of channel $W$ for CCCs with codeword composition 
$P$ and information rate $R > 0$ is denoted by $E_r(R, P, W)$ and represents a lower bound on the 
achievable error exponent. It is related to $E_{sp}(R, P, W)$ as 

$$E_r(R, P, W) = \begin{cases} E_{sp}(R, P, W), & \text{if } R \ge \hat{R}, \\ E_{sp}(\hat{R}, P, W) + \hat{R} - R, & \text{if } 0 \le R < \hat{R}, \end{cases} \tag{48}$$

where $\hat{R}$ is the smallest $R$ at which the convex curve $E_{sp}(R, P, W)$ meets its supporting line of 
slope $-1$. 

The structure of $V$ which achieves the minimum in (47) for $R < I(P, W)$ is given by the 
following lemma. For $R \ge I(P, W)$, the minimum in (47) is equal to zero, which is obtained by 
choosing $V = W$. 


Lemma 2. For $R < I(P, W)$, the stochastic matrix $V: \mathcal{X} \to \mathcal{Y}$ which minimizes $D(V\|W|P)$ 
subject to $I(P, V) \le R$ is given by 

$$V(y|x) = \frac{W(y|x)^{1-s}\, PV(y)^{s}}{\sum_{\hat{y} \in \mathcal{Y}} W(\hat{y}|x)^{1-s}\, PV(\hat{y})^{s}}, \tag{49}$$

where $PV(y)$ satisfies the set of simultaneous equations 

$$PV(y) = \sum_{x \in \mathcal{X}} P(x) V(y|x) = \sum_{x \in \mathcal{X}} \frac{P(x)\, W(y|x)^{1-s}\, PV(y)^{s}}{\sum_{\hat{y} \in \mathcal{Y}} W(\hat{y}|x)^{1-s}\, PV(\hat{y})^{s}}, \quad y \in \mathcal{Y}, \tag{50}$$

and $s \in [0, 1]$ is chosen such that $I(P, V) = R$. 

Proof: See Appendix F. ■ 
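
To make Lemma 2 concrete, the following Python sketch evaluates the tilted channel (49) for a BSC with uniform input composition. In this symmetric case the uniform output distribution is taken as a candidate solution of (50), and the code checks that it is indeed a fixed point. The helper names and the choice of example are assumptions made purely for illustration.

```python
from math import log2

def tilted_channel(W, PV, s):
    """V(y|x) proportional to W(y|x)^(1-s) * PV(y)^s, normalised over y, as in (49)."""
    V = []
    for row in W:
        weights = [w ** (1 - s) * q ** s for w, q in zip(row, PV)]
        total = sum(weights)
        V.append([w / total for w in weights])
    return V

def output_dist(P, V):
    """PV(y) = sum_x P(x) V(y|x)."""
    return [sum(P[x] * V[x][y] for x in range(len(P))) for y in range(len(V[0]))]

def mutual_information(P, V):
    """I(P, V) in bits."""
    Q = output_dist(P, V)
    return sum(P[x] * V[x][y] * log2(V[x][y] / Q[y])
               for x in range(len(P)) for y in range(len(Q)) if V[x][y] > 0)

def cond_divergence(V, W, P):
    """D(V || W | P) in bits, as in (46)."""
    return sum(P[x] * V[x][y] * log2(V[x][y] / W[x][y])
               for x in range(len(P)) for y in range(len(W[0])) if V[x][y] > 0)

p0 = 0.1
W = [[1 - p0, p0], [p0, 1 - p0]]   # BSC transition matrix, W[x][y]
P = [0.5, 0.5]                      # uniform composition
PV = [0.5, 0.5]                     # candidate solution of (50)

for s in (0.0, 0.25, 0.5, 0.75):
    V = tilted_channel(W, PV, s)
    # Check that the candidate output distribution satisfies (50).
    assert all(abs(a - b) < 1e-12 for a, b in zip(output_dist(P, V), PV))
    print(f"s={s:.2f}  I(P,V)={mutual_information(P, V):.4f}  "
          f"D(V||W|P)={cond_divergence(V, W, P):.4f}")
```

Sweeping $s$ from 0 towards 1 traces out pairs $(I(P, V), D(V\|W|P))$; by Lemma 2 these are points $(R, E_{sp}(R, P, W))$ on the sphere packing curve for this example.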

We remark that the random coding exponent function for a DMC was stated by Fano 
using the distributions $V(y|x)$ and $PV(y)$, given by (49) and (50), respectively, which were referred 
to as tilted probability distributions. However, the explicit statement of Lemma 2 seems not to 
have appeared in the literature before. 

The following theorem uses Shannon's random coding argument to bound the probability of 
error for CSCC with subblock-composition $P$ on a DMC. It also applies Lemma 2 to compactly 
express the error probability in terms of the sphere packing exponent function. 

Theorem 7. There exists a CSCC with subblock length $L$, subblock-composition $P$, and codeword 
length $n$, transmitting information at rate $R > 0$ on DMC $W$, for which the maximum probability 
of error is upper bounded as 

$$P_e \le \begin{cases} 2 \exp\left(-n E_{sp}(R', P, W)\right), & \text{if } R' \ge \hat{R}, \\ \exp\left(-n \left(E_{sp}(\hat{R}, P, W) + \hat{R} - R'\right)\right), & \text{if } R' < \hat{R}, \end{cases} \tag{51}$$

where $R' = R + r(L, P)$ and $\hat{R}$ is the smallest $R'$ at which the convex curve $E_{sp}(R', P, W)$ 
meets its supporting line of slope $-1$. 

Proof: See Appendix G. ■ 

The following corollary is immediate. 

Corollary 2. The error exponent for CSCC with subblock length $L$, subblock-composition $P$, 
information rate $R > 0$ on DMC $W$, is lower bounded by 

$$E_r(R + r(L, P), P, W). \tag{52}$$

Thus the bound on the error exponent for CSCC is related to the error exponent for CCC by 
the same term, $r(L, P)$, as the bound for the rate penalty (31). 


V. Beyond Constant Subblock Composition Codes 

In a CSCC, every subblock within any codeword has the same composition, and this 
composition is chosen to meet the subblock energy constraint dS]). The capacity using CSCC (given by 
(11)) is achieved by choosing that subblock-composition in $\Gamma_B$ (given by (10)) which maximizes 
the information rate. We will see that rates greater than $C^L_{\mathrm{CSCC}}(B)$ can be achieved while still 
meeting the subblock energy constraint ©. 
We first review known results when constraints are placed on the entire codeword (with no 
subblock constraints) [10], [46]. Let $X_1^n = (X_1, X_2, \ldots, X_n)$ denote any codeword of length $n$. 
If we impose the average energy constraint on codewords, 

$$\frac{1}{n} \sum_{i=1}^{n} b(X_i) \ge B, \tag{53}$$

then the channel capacity with this constraint is [10], [46] 

$$\max_{P_X:\, E_{P_X}[b(X)] \ge B} I(X;Y). \tag{54}$$

Information rates arbitrarily close to this capacity can be achieved by making the codeword 
length sufficiently large. Moreover, if $P_X^*$ is an input distribution which maximizes (54), then 
this capacity can be achieved by a sequence of CCCs with codeword composition tending to 
$P_X^*$ [45], [46]. Thus, if $C_{\mathrm{CCC}}(B)$ denotes the capacity using CCC when the average energy per 
symbol is constrained to be at least $B$, then 

$$C_{\mathrm{CCC}}(B) = \max_{P:\, E_P[b(X)] \ge B} C_{\mathrm{CCC}}(P) \tag{55}$$

$$= \max_{P_X:\, E_{P_X}[b(X)] \ge B} I(X;Y). \tag{56}$$

Thus the capacity with codeword constraints can be achieved by restricting the codewords to 
have a fixed composition. This is possible because, for a given transmission rate, the codebook 
size increases exponentially with codeword length $n$ while the number of different types of 
sequences increases only polynomially with $n$. 
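
The counting argument above can be illustrated with a short Python sketch. It uses the standard count $\binom{n + |\mathcal{X}| - 1}{|\mathcal{X}| - 1}$ for the number of types of length-$n$ sequences, and the pigeonhole observation that some single type contains at least a $1/N_n$ fraction of any codebook, where $N_n$ is the number of types. The function names are hypothetical and the rate of 0.5 bits per channel use is an arbitrary example value.

```python
from math import comb, log2

def num_types(n, alphabet_size):
    """Number of compositions (types) of length-n sequences over the alphabet."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)

def codebook_size(n, rate_bits):
    """Number of codewords, 2^(nR), at rate R bits per channel use."""
    return 2 ** round(n * rate_bits)

def type_rate_loss(n, alphabet_size):
    """Upper bound (1/n) log2(number of types) on the rate lost by keeping one type."""
    return log2(num_types(n, alphabet_size)) / n

for n in (10, 100, 1000):
    print(f"n={n:5d}  types={num_types(n, 2):5d}  "
          f"log2(codebook)={round(n * 0.5):4d}  "
          f"rate loss <= {type_rate_loss(n, 2):.4f} bits")
```

The rate loss bound shrinks roughly as $\log(n)/n$, so it vanishes as the codeword length grows.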

We will now show that, contrary to the case with codeword constraints, when the constraints 
are applied to fixed-size subblocks, information rates can, in general, be increased by 
not restricting the subblocks to have a fixed composition. Towards this, we define a subblock 
energy-constrained code (SECC) as a code which satisfies the subblock energy constraint given 
by ©. Since all subblocks in an SECC satisfy ©, the composition of each subblock belongs to 
the set $\Gamma_B$. 

Let $C^L_{\mathrm{SECC}}(B)$ denote the capacity using SECC with subblock length $L$ and average energy 
per symbol at least $B$. Similar to CSCC, the $L$ uses of the channel in case of SECC induce a 
vector channel with input alphabet 

$$\mathcal{A} = \bigcup_{P \in \Gamma_B} T_P^L, \tag{57}$$

output alphabet $\mathcal{Y}^L$, and channel transition probabilities given by ®. Since each subblock may 
be chosen independently, 

$$C^L_{\mathrm{SECC}}(B) = \frac{1}{L} \max I(X_1^L; Y_1^L), \tag{58}$$

where the maximization is over the probability distribution of input vectors in $\mathcal{A}$. For a noiseless 
$q$-ary channel ($\mathcal{X} = \mathcal{Y} = \{0, 1, \ldots, q - 1\}$, $W(i|i) = 1$, $i \in \mathcal{X}$), it is easy to check that SECC 
capacity is achieved by the uniform distribution of $X_1^L$ over $\mathcal{A}$. Thus for the noiseless channel, 
we have $C^L_{\mathrm{SECC}}(B) = \frac{1}{L} \log |\mathcal{A}|$. 
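
For the noiseless binary channel, the gap between SECC and CSCC can be computed by simple counting. The Python sketch below assumes on-off keying energy values $b(0) = 0$, $b(1) = 1$ (the values used later in the numerical results), so that a subblock satisfies the energy constraint when its fraction of ones is at least $B$. It also takes the noiseless CSCC capacity to be the largest $\frac{1}{L}\log_2 |T_P^L|$ over allowed compositions. The function names are illustrative.

```python
from math import ceil, comb, log2

def allowed_weights(L, B):
    """Numbers of ones k with k/L >= B (small guard against rounding in L*B)."""
    return range(ceil(L * B - 1e-12), L + 1)

def secc_noiseless(L, B):
    """(1/L) log2 |A|, where A holds every length-L subblock with energy >= L*B."""
    return log2(sum(comb(L, k) for k in allowed_weights(L, B))) / L

def cscc_noiseless(L, B):
    """Largest (1/L) log2 |T_P^L| over compositions meeting the energy constraint."""
    return max(log2(comb(L, k)) for k in allowed_weights(L, B)) / L

L, B = 8, 0.6
print(f"SECC: {secc_noiseless(L, B):.4f} bits, CSCC: {cscc_noiseless(L, B):.4f} bits")
```

With $L = 8$ and $B = 0.6$, the set $\mathcal{A}$ contains all subblocks with five or more ones, whereas a CSCC is restricted to a single weight, which is why the SECC value is larger.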

For CSCC, the induced vector channel was symmetric (irrespective of the underlying (scalar) 
DMC being symmetric or not), and hence the capacity was achieved with a uniform distribution 
over the input alphabet. In contrast, in case of SECC the induced vector channel need not be 
symmetric even when the underlying DMC is symmetric. This is formalized in the following 
theorem which is proved by providing a counterexample. 

Theorem 8. Uniform distribution of $X_1^L$ over $\mathcal{A}$ may not achieve SECC capacity even when the 
underlying DMC is symmetric. 


Proof: See Appendix H. ■ 

Finding the probability distribution which achieves the maximum in (58) is not straightforward, 
in general. If $U_{\mathcal{A}}$ denotes the uniform distribution of $X_1^L$ over $\mathcal{A}$, then the maximum information 
rate achievable with $U_{\mathcal{A}}$, denoted $C^L_U(B)$, acts as a lower bound for $C^L_{\mathrm{SECC}}(B)$. Since a CSCC 
can be viewed as a SECC where the input vectors have the same composition, it follows that 
$C^L_{\mathrm{CSCC}}(B)$ is also a lower bound for $C^L_{\mathrm{SECC}}(B)$. Thus we have 

$$C^L_{\mathrm{SECC}}(B) \ge \max\left\{ C^L_U(B),\ C^L_{\mathrm{CSCC}}(B) \right\}. \tag{59}$$

The following proposition is useful in reducing the computational complexity of $C^L_U(B)$. 


Proposition 3. For a random input vector $X_1^L$ uniformly distributed over $\mathcal{A}$ with corresponding 
output vector $Y_1^L$, the pairwise joint probability, for $1 \le i \le L$, satisfies 

$$P_{XY}(X_i = x, Y_i = y) = \sum_{P \in \Gamma_B} \frac{|T_P^L|}{|\mathcal{A}|}\, P(x) W(y|x). \tag{60}$$

Proof: When $X_1^L$ is uniformly distributed over $\mathcal{A}$, 

$$\Pr(X_1^L \in T_P^L) = \frac{|T_P^L|}{|\mathcal{A}|}. \tag{61}$$

From Prop. [H it follows that 

$$\Pr(X_i = x, Y_i = y \mid X_1^L \in T_P^L) = P(x) W(y|x). \tag{62}$$

Finally, (60) follows from (61) and (62) since $P_{XY}(X_i = x, Y_i = y)$ is equal to 

$$\sum_{P \in \Gamma_B} \Pr(X_1^L \in T_P^L)\, \Pr(X_i = x, Y_i = y \mid X_1^L \in T_P^L). \tag{63}$$

■ 


Another useful observation with SECC is that if $y_1^L$ and $\tilde{y}_1^L$ are two output vectors having the 
same composition, then the columns of the induced vector channel transition matrix 
corresponding to $y_1^L$ and $\tilde{y}_1^L$ are permutations of each other. This follows from arguments similar to those 
presented in Appendix A for CSCC. Thus, if $X_1^L$ is distributed uniformly over $\mathcal{A}$, then for $y_1^L$ 
and $\tilde{y}_1^L$ having the same composition, we have 

$$P_{Y^L}(y_1^L) = P_{Y^L}(\tilde{y}_1^L) = \frac{1}{|\mathcal{A}|} \sum_{x_1^L \in \mathcal{A}} W^L(y_1^L | x_1^L). \tag{64}$$

The next proposition gives a computationally efficient expression for $C^L_U(B)$. 


Proposition 4. $C^L_U(B)$ can be expressed as 

$$\frac{1}{L} \sum_{Q \in \mathcal{Q}_L} |T_Q^L|\, P_{Y^L}(y_Q^L) \log \frac{1}{P_{Y^L}(y_Q^L)} - H(Y|X), \tag{65}$$

where $\mathcal{Q}_L$ is the set of all compositions for output vectors of length $L$, only one representative 
output vector $y_Q^L$ is chosen from every type class $T_Q^L$, $P_{Y^L}(y_Q^L)$ is given by (64), and $H(Y|X)$ 
is evaluated using the joint pairwise probability distribution given by (60). 


Proof: For a DMC, we have 

$$C^L_U(B) = \frac{1}{L} \left( H(Y_1^L) - \sum_{i=1}^{L} H(Y_i | X_i) \right), \tag{66}$$

where the probability of $y_1^L \in \mathcal{Y}^L$ is given by (64). Thus, (65) follows from (66), (60) and the 
observation that output vectors with the same composition have equal probability when input 
subblocks are uniformly distributed over $\mathcal{A}$. ■ 
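
The saving offered by Proposition 4 can be seen in a small Python sketch for a BSC, again assuming on-off keying energy values $b(0) = 0$, $b(1) = 1$. The function `c_uniform_by_type` follows (65), using one representative output vector per weight, while `c_uniform_brute_force` sums over all $2^L$ output vectors for comparison. On a BSC the term $H(Y|X)$ equals $h(p_0)$ for any input distribution, which the code uses directly. All function names are hypothetical.

```python
from itertools import product
from math import comb, log2

def h(x):
    """Binary entropy in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def input_set(L, B):
    """A: binary subblocks whose fraction of ones is at least B."""
    return [x for x in product((0, 1), repeat=L) if sum(x) >= L * B - 1e-12]

def output_prob(y, A, p0):
    """P_{Y^L}(y) for a uniform input over A on a BSC, as in (64)."""
    L = len(y)
    total = 0.0
    for x in A:
        d = sum(a != b for a, b in zip(x, y))    # Hamming distance
        total += p0 ** d * (1 - p0) ** (L - d)
    return total / len(A)

def c_uniform_by_type(L, B, p0):
    """Expression (65): one representative output vector per composition."""
    A = input_set(L, B)
    H_Y = 0.0
    for w in range(L + 1):
        y = (1,) * w + (0,) * (L - w)             # representative of weight w
        q = output_prob(y, A, p0)
        if q > 0:
            H_Y += comb(L, w) * q * log2(1 / q)
    return H_Y / L - h(p0)                        # H(Y|X) = h(p0) on a BSC

def c_uniform_brute_force(L, B, p0):
    """Same quantity, summing over all 2^L output vectors."""
    A = input_set(L, B)
    H_Y = 0.0
    for y in product((0, 1), repeat=L):
        q = output_prob(y, A, p0)
        if q > 0:
            H_Y += q * log2(1 / q)
    return H_Y / L - h(p0)

print(c_uniform_by_type(6, 0.5, 0.1), c_uniform_brute_force(6, 0.5, 0.1))
```

The two functions agree up to floating-point error, but the first evaluates only $L + 1$ output probabilities instead of $2^L$.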

As discussed earlier, the energy requirement per subblock is stricter than the average energy 
requirement per codeword. Hence, the capacity using codes with subblock-constraint ([U) is no greater 
than the capacity using codes with codeword constraint (53). Since CCCs achieve capacity with 
codeword constraint Il4^, we have 

$$C^L_{\mathrm{SECC}}(B) \le C_{\mathrm{CCC}}(B). \tag{67}$$

From (59) and (67) it follows that $C^L_{\mathrm{CSCC}}(B) \le C^L_{\mathrm{SECC}}(B) \le C_{\mathrm{CCC}}(B)$. Further, using (IMl) it 
follows that SECC capacity tends to CCC capacity as $L \to \infty$. We will compare these capacities 
for different cases in the numerical results section. 

VI. Real-time Information Transfer 

So far, we could ensure real-time energy transfer to the receiver by placing constraints on 
the subblock-composition. For information transfer, although joint decoding of all the subblocks 
within a codeword is preferred for reducing the probability of error, it also causes delay in 
information arrival. 

For enabling real-time information transfer, the receiver may decode each subblock 
independently, and thus avoid waiting for the arrival of future subblocks. Here, since the subblock decoding 
proceeds the instant that subblock has been completely received, the information transfer delay 
is only due to subblock transfer time and the corresponding decoding delay. 

When each subblock within the transmitted sequence is decoded independently of other 
subblocks, each subblock may itself be viewed as a codeword. We will refer to the independent 
decoding of subblocks as local subblock decoding (LSD). We remark that this subblock-based 
decoding is distinct from decoding for locally decodable codes, which allow any bit of the message 
to be decoded with high probability by querying only a small number of received bits [50]. 

A. Local Subblock Decoding 

In case of local subblock decoding, each subblock may be treated as an independent codeword 
since every subblock is decoded independently. We are interested in estimating achievable rates 
with bounded error probability when local subblock decoding is employed. We now provide a 
short review of an existing result on achievable rates for constant composition finite blocklength 
codes. This result will then be used (in Sec. VII) to compare rates between local (independent) 
subblock decoding and joint subblock decoding. 

Let $M^*(n, \epsilon)$ denote the maximum size of a length-$n$ constant composition code for a DMC 
with average error probability no larger than $\epsilon$. When the composition of codewords is equal 
to an input probability distribution which maximizes the mutual information and the channel 
satisfies some regularity conditions, then [51]-[53] 

$$\log M^*(n, \epsilon) = nC - \sqrt{nV}\, Q^{-1}(\epsilon) + \frac{1}{2} \log n + O(1), \tag{68}$$

where $C$ is the channel capacity, $V$ is the information variance, and $Q$ is the Gaussian Q-function 
[52]. We remark that $V$ is also termed channel dispersion in the literature [54]. Early results on 
finite blocklength capacity for memoryless symmetric channels are due to Weiss [55], which 
were generalized for the DMC and strengthened by Strassen [56]. 

When each codeword has an equal number of ones and zeros, the achievable rate in bits per 
channel use for a BSC with crossover probability $p$ using CCC is approximated as [51]: 

$$\frac{\log_2 M^*(n, \epsilon)}{n} \approx C - \sqrt{\frac{p(1 - p)}{n}}\, \log_2\!\left(\frac{1 - p}{p}\right) Q^{-1}(\epsilon) + \frac{1}{2n} \log_2 n, \tag{69}$$

with $C = 1 + p \log_2 p + (1 - p) \log_2(1 - p)$. 
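
The approximation (69) is straightforward to evaluate numerically. The Python sketch below uses the standard-library `statistics.NormalDist` to obtain $Q^{-1}(\epsilon) = \Phi^{-1}(1 - \epsilon)$; the function names and the parameter values in the loop are illustrative choices rather than values taken from the paper.

```python
from math import log2, sqrt
from statistics import NormalDist

def bsc_capacity(p):
    """C = 1 + p log2 p + (1 - p) log2(1 - p), for 0 < p < 1."""
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)

def q_inverse(eps):
    """Inverse Gaussian Q-function: Q^{-1}(eps) = Phi^{-1}(1 - eps)."""
    return NormalDist().inv_cdf(1 - eps)

def lsd_rate(n, eps, p):
    """Approximate achievable rate (69) in bits per channel use at blocklength n."""
    return (bsc_capacity(p)
            - sqrt(p * (1 - p) / n) * log2((1 - p) / p) * q_inverse(eps)
            + log2(n) / (2 * n))

for n in (16, 64, 256, 1024):
    print(f"n={n:5d}  rate ~ {lsd_rate(n, eps=1e-3, p=0.11):.4f} bits")
```

With local subblock decoding the blocklength is $n = L$, so short subblocks pay the dominant $\sqrt{1/n}$ penalty term in (69). For very small $n$ the approximation can even turn negative, which signals that it should not be trusted in that regime.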


VII. Numerical Results and Discussion 

In this section, we provide examples highlighting the tradeoff between delivery of sufficient 
energy to the receiver and achieving high information transfer rates. These results are used to 
draw meaningful insights into the choice of subblock length and subblock composition as a function 
of required energy per symbol at the receiver. 

Fig. 3 plots $C^L_{\mathrm{CSCC}}(B)$ as a function of $B$ for different values of $L$ for a BSC with crossover 
probability $p_0 = 0.1$. The $b$-values are assumed to be $b(0) = 0$ and $b(1) = 1$. These $b$-values 
reflect the case of on-off keying where bit-1 (bit-0) is represented by the presence (absence) of 
a carrier signal. Fig. 3 shows that, in general, the value of information rate given by $C^L_{\mathrm{CSCC}}(B)$ 
increases with an increase in the subblock length $L$, for a given $B$. This is because an increase in 
$L$ leads to greater choice for input symbols within a subblock. Note that the smaller the value of 
$L$, the greater the uniformity in energy distribution within a codeword. The reduction in capacity 
due to the choice of smaller $L$ is the price we pay for providing smoother energy content. 

The plot for $L = \infty$ is evaluated using (54); this follows from (11), (l3^, (55), and the fact 
that $\lim_{L \to \infty} r(L, P) = 0$. Thus the curve corresponding to $L = \infty$ is the same as the $C_{\mathrm{CCC}}(B)$ 
curve. This curve is a non-increasing concave function of $B$ for $0 \le B \le b_{\max}$. This claim 
can be proved using the approach in [10]. It is non-increasing since the feasibility set $\Gamma_B$ can 
only become smaller as $B$ increases. The concavity of $C_{\mathrm{CCC}}(B)$ follows from the concavity 
of $I(X;Y)$ as a function of the probability distribution of $X$ and the fact that, for $0 < \alpha < 1$, the 
conditions $E_{P_1}[b(X)] \ge B_1$ and $E_{P_2}[b(X)] \ge B_2$ imply that 

$$E_{\alpha P_1 + (1 - \alpha) P_2}[b(X)] \ge \alpha B_1 + (1 - \alpha) B_2. \tag{70}$$

The non-increasing concave nature of the capacity-power function was used in [57] to show the 
suboptimality of a time-sharing approach to energy and information transfer. 

Fig. 3. Plot of $C^L_{\mathrm{CSCC}}(B)$ versus $B$ for BSC with crossover probability $p_0 = 0.1$, $b(0) = 0$, $b(1) = 1$. 

The CSCC capacity is plotted in Fig. 4 for a BSC as a function of the receiver energy buffer 
size, $E_{\max}$, with $B = 0.5$. The subblock length $L$ is chosen as a function of $E_{\max}$ to satisfy (27). 
Since $L$ increases with increasing values of $E_{\max}$, the CSCC capacity is an increasing function 
of $E_{\max}$. For $p_0 = 0.1$, the CSCC capacity is limited by the relatively high value of the crossover 
probability, rather than the subblock length, with capacity remaining almost constant as $E_{\max}$ 
is increased beyond 10. On the other hand, for $p_0 = 0.01$, the CSCC capacity is limited by the 
subblock length (since 'noise' is weak). From (27) we observe that the subblock length tends to 
infinity as $E_{\max}$ tends to infinity, and hence the CSCC capacity corresponding to $E_{\max} \to \infty$ is 
equal to $C_{\mathrm{CCC}}(B)$. 

Fig. 5 plots the rate penalty incurred by using CSCC instead of CCC, for a BSC with crossover 
probability $p_0$, $L = 16$, and $\Pr(0) = \Pr(1) = 0.5$. As discussed in Sec. IV-A, the upper bound 
on the rate penalty given by $r(L, P)$ is shown to be close to the exact value when $p_0 \to 0$. Note 
that $r(L, P)$ is independent of the underlying channel. A tighter bound on the rate penalty, given 
by $h(p_0 \star \gamma) - h(p_0 \star \alpha)$, is also plotted (see Theorem 4). These bounds are useful in estimating 
the rate penalty for large values of $L$ when the computational complexity of $C^L_{\mathrm{CSCC}}(P)$ becomes 
high. The bounds on rate penalty may also be used to bound the exact value of $C^L_{\mathrm{CSCC}}(P)$ for 
large $L$. 

Fig. 4. Plot of CSCC capacity versus receiver energy buffer size, $E_{\max}$, with $B = 0.5$, $b(0) = 0$, $b(1) = 1$ for BSC with 
crossover probability $p_0 \in \{0.01, 0.1\}$. 

Fig. 5. Plot of $C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P)$ as a function of BSC crossover probability $p_0$ for $L = 16$ and $\Pr(0) = \Pr(1) = 0.5$. 

Fig. 6. Comparison of capacity of different schemes for a noiseless binary channel with $b(0) = 0$, $b(1) = 1$. 

Fig. 6 compares the capacity of CSCC and SECC for a noiseless binary channel with $b(0) = 0$, 
$b(1) = 1$ and subblock length $L = 8$. Note that the capacity curve for CCC may be viewed 
as the CSCC capacity curve corresponding to $L = \infty$. Fig. 6 highlights the potential of 
improving the CSCC capacity by using SECCs and allowing different subblocks to have different 
compositions while still meeting the subblock energy constraint ([U). With SECCs, the capacity 
for a noiseless channel is achieved by a uniform distribution of input vectors and can thus be 
efficiently computed using (1^ . 

Fig. 7 compares the capacity of different schemes for $L = 8$ and $B = 0.6$, as a function of BSC 
crossover probability $p_0$. It shows that for $p_0 < 0.05$, the capacity with uniform distribution 
over the set of length-$L$ vectors which satisfy the subblock energy constraint ([I]) is higher 
compared to CSCC capacity. However, $C^L_U(B) < C^L_{\mathrm{CSCC}}(B)$ for relatively higher values of $p_0$. 
This observation emphasizes the fact that merely adding more types is not sufficient to increase 
capacity compared to CSCC; we need to choose an appropriate distribution over the enlarged 
alphabet as well. In Fig. 7, we used the Blahut-Arimoto algorithm [58], [59] to compute the 
exact SECC capacity, $C^L_{\mathrm{SECC}}(B)$. 

Fig. 7. Comparison of capacity of different schemes for $L = 8$, $B = 0.6$, as a function of BSC crossover probability $p_0$ and 
$b(0) = 0$, $b(1) = 1$. 

Fig. 8 compares achievable rates using local subblock decoding (LSD) with rates using joint 
subblock decoding for a BSC with crossover probability $p_0 = 0.11$ when each subblock has 
an equal number of zeros and ones (that is, $P(0) = P(1) = 0.5$). In case of CSCC with LSD, 
each subblock may itself be viewed as a codeword and so the achievable rate is approximated 
by (69) with $n = L$. The achievable rates with LSD are obtained using (69) and seen to fall 
significantly as the desired probability of error, $\epsilon$, tends to zero. The red curve plots the lower bound 
on $C^L_{\mathrm{CSCC}}(P)$ obtained using (iTTl). Note that $C^L_{\mathrm{CSCC}}(P)$ represents the rate with joint subblock 
decoding for which the probability of error can be brought arbitrarily close to zero by increasing 
the number of subblocks in a codeword and then jointly decoding the subblocks. 

Notice that the rate loss decreases as $\sqrt{1/L}$ with LSD whereas the rate loss with joint decoding 
decreases as $\log(L)/L$. Ensuring the ability to use energy in real-time imposes less of a penalty 
than the ability to use information in real-time. 

Fig. 8. Rates for a BSC with crossover probability $p = 0.11$. 

VIII. Reflections 

We proposed the use of CSCCs for providing regular energy content in a patterned 
energy signal which is used for simultaneous transfer of energy and information. The 
subblock-composition in CSCC was chosen to maximize the rate of information transfer while ensuring that 
the fraction of input symbols carrying high energy within every subblock duration is sufficiently 
large. For characterizing the exact CSCC capacity, we employed a super-letter approach (with 
each subblock being viewed as a single super-letter in an induced vector-channel) and showed 
that the computational complexity of CSCC capacity can be alleviated by exploiting certain symmetry 
properties. 

The super-letter approach can also be applied to compute the CSCC error exponent. However, 
the size of the super-alphabet grows exponentially with subblock length, $L$, and the cost for 
computing exact CSCC capacity and error exponent may become prohibitive for large $L$. In 
this scenario, the CSCC capacity and error exponent can be estimated by using their respective 
bounds, derived in Sec. IV, in terms of the capacity and error exponent for constant composition 
codes. Compared to CCC, the use of CSCCs incurs a rate loss due to the constraint restricting 
the subblocks to have the same composition. We showed that the CSCC error exponent is related 
to the CCC error exponent by the same rate loss term. 

We also showed that CSCC capacity can be increased by allowing different subblocks to 
have different compositions, while still meeting the subblock energy constraint. Although CSCC 
capacity is shown to be achieved by uniform distribution of super-letters, one may have to 
resort to numerical techniques (such as the Blahut-Arimoto algorithm) for obtaining a capacity 
achieving input distribution for the case where different subblocks are permitted to have different 
compositions. 

We provided examples highlighting the tradeoff between delivery of sufficient energy to the 
receiver and achieving high information transfer rates. It was observed that the ability to use 
energy in real-time imposes less of a penalty than the ability to use information in real-time. 

We showed that the subblock length in CSCC can be bounded as a function of the receiver 
energy storage capacity to avoid energy outage at the receiver. In scenarios where the energy 
harvested at the receiver upon transmission of an input symbol varies over time, it will be 
appealing to analyze bounds on subblock length which apply energy arrival statistics to ensure 
that the energy outage probability is lower than a certain threshold. Future work may also be 
carried out on extending CSCC capacity results to other channel models, such as the AWGN channel 
where the average transmit power is also constrained. 

Other than the application of simultaneous energy and information transfer, CSCCs are also 
suitable candidates for power line communications due to their ability to provide regular energy 
content. CSCCs may also find application in other diverse fields. For instance, the 
multiply constant-weight codes (MCWC) proposed in [11] for use in low-cost authentication 
methods are a special case of CSCC with binary input alphabet. Thus, our capacity results for 
CSCC can also be employed as a performance benchmark for practical MCWCs [11]. 
Appendix A 
Proof of Theorem 1 

We will prove Theorem 1 by first proving some simple lemmas and employing Gallager's 
definition of a symmetric channel [60]. 

If $\pi$ denotes any permutation on $L$ letters with 

$$\pi(x_1^L) = \pi(x_1, x_2, \ldots, x_L) = (x_{\pi(1)}, x_{\pi(2)}, \ldots, x_{\pi(L)}), \tag{71}$$

then we have the following lemmas. 

Lemma 3. 

$$W^L(\pi(y_1^L) \mid \pi(x_1^L)) = W^L(y_1^L \mid x_1^L) \tag{72}$$

Proof: For a DMC, we have 

$$W^L(\pi(y_1^L) \mid \pi(x_1^L)) = \prod_{i=1}^{L} W(y_{\pi(i)} \mid x_{\pi(i)}) = \prod_{i=1}^{L} W(y_i \mid x_i) = W^L(y_1^L \mid x_1^L).$$

■ 

Lemma 4. The following sets are equal 

$$\{\pi(x_1^L) \mid x_1^L \in T_P^L\} = T_P^L \tag{73}$$

Proof: A permutation preserves the composition of a sequence. Thus, $\pi$ may be viewed as 
a map $\pi: T_P^L \to T_P^L$. This map is injective by definition of a permutation. Since the set $T_P^L$ is 
finite, this map is also surjective and hence (73) follows. ■ 

Lemma 5. The following sets are equal 

$$\{W^L(\pi(y_1^L) \mid x_1^L) : x_1^L \in T_P^L\} = \{W^L(y_1^L \mid x_1^L) : x_1^L \in T_P^L\} \tag{74}$$

Proof: From Lemma 3 we have $W^L(\pi(y_1^L) \mid x_1^L) = W^L(y_1^L \mid \pi^{-1}(x_1^L))$. Now (74) follows 
from Lemma 4. ■ 

Let the composition of the output vector $y_1^L \in \mathcal{Y}^L$ be $Q$ and let $T_Q^L$ be the set of all output 
vectors of length $L$ having composition $Q$. 

Lemma 6. The following sets are equal 

$$\{W^L(y_1^L \mid \pi(x_1^L)) : y_1^L \in T_Q^L\} = \{W^L(y_1^L \mid x_1^L) : y_1^L \in T_Q^L\} \tag{75}$$

Proof: Similar to Lemma 5. ■ 

We recall Gallager's definition [60] of a symmetric DMC. 

Definition 1. A DMC is symmetric if the set of outputs can be partitioned into subsets in such a 
way that for each subset the matrix of transition probabilities (using inputs as rows and outputs 
of the subsets as columns) has the property that each row is a permutation of each other row 
and each column (if more than 1) is a permutation of each other column. 

We will show that when CSCC is employed on a DMC, the induced vector-channel is 
symmetric. Note that the underlying (scalar) channel can be any arbitrary DMC (not necessarily 
symmetric). 

Lemma 7. When CSCC with subblock length L is employed on any DMC, the induced vector- 
channel (obtained from L uses of the DMC) is symmetric. 

Proof: The lemma will be proved if we can partition the outputs into subsets such that for 
each subset the matrix of transition probabilities has the property that each row (column) is a 
permutation of each other row (column). 

We now show that if we partition the outputs into subsets such that each subset contains all 
the outputs of a given composition, then the symmetry conditions will be satisfied. 

If $y_1^L \in T_Q^L$ and $\tilde{y}_1^L \in T_Q^L$ for a given composition $Q$, then since $y_1^L$ and $\tilde{y}_1^L$ have the same 
composition, we have $\tilde{y}_1^L = \pi(y_1^L)$ for some permutation $\pi$. Let $T_P^L$ be the input alphabet for the 
induced vector channel using CSCC with subblock-composition $P$. Then using Lemma 5, we 
note that the columns of the vector-channel transition matrix corresponding to output subset $T_Q^L$ 
are permutations of each other. Similarly, using Lemma 6, we can prove that the corresponding 
rows are permutations of each other. ■ 
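
Lemma 7 can be checked by brute force for small parameters. The Python sketch below builds the induced vector channel for a binary constant-composition input alphabet, partitions the outputs by composition, and tests whether all rows (and all columns) within each sub-matrix are permutations of one another. The Z-channel parameters and the function names are illustrative assumptions; the scalar channel is deliberately chosen to be asymmetric.

```python
from itertools import product
from math import prod

def type_class(L, k):
    """All binary length-L vectors with exactly k ones."""
    return [x for x in product((0, 1), repeat=L) if sum(x) == k]

def vector_channel_prob(y, x, W):
    """W^L(y|x) = prod_i W(y_i|x_i) for a memoryless channel, with W[x][y]."""
    return prod(W[a][b] for a, b in zip(x, y))

def is_symmetric_under_cscc(L, k, W, tol=1e-12):
    """Gallager symmetry test with inputs T_P^L and outputs grouped by composition."""
    inputs = type_class(L, k)
    for w in range(L + 1):
        outputs = type_class(L, w)
        # Sorting each row/column turns "is a permutation of" into equality.
        rows = [sorted(vector_channel_prob(y, x, W) for y in outputs) for x in inputs]
        cols = [sorted(vector_channel_prob(y, x, W) for x in inputs) for y in outputs]
        for group in (rows, cols):
            for v in group:
                if any(abs(a - b) > tol for a, b in zip(v, group[0])):
                    return False
    return True

Z = [[1.0, 0.0], [0.3, 0.7]]   # Z-channel: input 1 flips to 0 with probability 0.3
print(is_symmetric_under_cscc(L=4, k=2, W=Z))
```

The check is exponential in $L$ and is only meant as a sanity test of the lemma on toy examples.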

Theorem 9 ([60, p. 94]). For a symmetric discrete memoryless channel, capacity is achieved 
by using the inputs with equal probability. 

Finally, Theorem 1 follows directly from Lemma 7 and Theorem 9. ■ 
Appendix B 
Proof of Theorem 2 


When $L$ satisfies (27), then $E_{\max} \ge 2G$. Since $E(1) \ge G$, the energy level at the start of 
every subblock is at least $G$ (by recursive application of Lemma 1(c)) and sufficiency follows 
from Lemma 1(b). 

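Now let $L_1 = L \sum_{x \in \mathcal{X}_<} P(x)$, and define 

$$P_1(x) = \begin{cases} \dfrac{P(x)}{\sum_{\tilde{x} \in \mathcal{X}_<} P(\tilde{x})}, & \text{if } x \in \mathcal{X}_<, \\ 0, & \text{if } x \in \mathcal{X}_{\ge}, \end{cases} \tag{76}$$

$$P_2(x) = \begin{cases} 0, & \text{if } x \in \mathcal{X}_<, \\ \dfrac{P(x)}{\sum_{\tilde{x} \in \mathcal{X}_{\ge}} P(\tilde{x})}, & \text{if } x \in \mathcal{X}_{\ge}, \end{cases} \tag{77}$$

$$S_1 = \left\{ x_1^L \;\middle|\; x_1^{L_1} \in T_{P_1}^{L_1},\ x_{L_1+1}^{L} \in T_{P_2}^{L - L_1} \right\}, \tag{78}$$

$$S_2 = \left\{ x_1^L \;\middle|\; x_1^{L - L_1} \in T_{P_2}^{L - L_1},\ x_{L - L_1 + 1}^{L} \in T_{P_1}^{L_1} \right\}. \tag{79}$$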

Clearly $S_1 \subset T_P^L$ and $S_2 \subset T_P^L$, where $S_1$ (resp. $S_2$) denotes the set of subblocks of length $L$ with 
first (resp. last) $L_1$ input symbols belonging to $\mathcal{X}_<$. Note that $E(1) \ge G$ is necessary to avoid 
outage because if $E(1) < G$, then outage results when the first subblock in a codeword belongs 
to $S_1$. To prove that (27) is necessary, we will show that when 

$$L > \frac{E_{\max}}{2 \sum_{x \in \mathcal{X}_<} P(x) (B - b(x))}, \tag{80}$$

then CSCC codewords exist which will result in energy outage at the receiver. Here we have 

$$G = \sum_{x \in \mathcal{X}_<} L P(x) (B - b(x)) > \frac{E_{\max}}{2}. \tag{81}$$

Let the first subblock in a given codeword belong to $S_2$. Since the last $L_1$ symbols (within the 
first subblock) belong to $\mathcal{X}_<$, we have $E(L + 1) = [E(L - L_1 + 1) - G]^+$. If there is no outage 
during the reception of the first subblock, 

$$E(L + 1) = E(L - L_1 + 1) - G \le E_{\max} - G < E_{\max}/2, \tag{82}$$

where the last inequality follows from (81). Now let the second subblock belong to $S_1$. There 
is no energy outage during the reception of the first $L_1$ symbols within the second subblock if and 
only if $E(L + 1) \ge G$. However, from (82) and (81) it follows that $E(L + 1) < E_{\max}/2 < G$, and 
hence outage cannot be avoided in the second subblock. In general, outage results if $L$ satisfies 
(80), and any two adjacent subblocks in a codeword belong to $S_2$ and $S_1$, respectively. ■ 
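Appendix C 
Proof of Theorem 4 


The strict inequality $0 < C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P)$ follows for a BSC with crossover probability 
$0 < p_0 < 0.5$ because 

\begin{align}
C^L_{\mathrm{CSCC}}(P) &= \frac{1}{L} H(Y_1^L) - \frac{1}{L} \sum_{i=1}^{L} H(Y_i | X_i) \tag{83} \\
&= \frac{1}{L} \sum_{i=1}^{L} H(Y_i | Y^{i-1}) - \frac{1}{L} \sum_{i=1}^{L} H(Y_i | X_i) \tag{84} \\
&\overset{(a)}{<} \frac{1}{L} \sum_{i=1}^{L} H(Y_i) - \frac{1}{L} \sum_{i=1}^{L} H(Y_i | X_i) \tag{85} \\
&= C_{\mathrm{CCC}}(P), \tag{86}
\end{align}

where $Y^{i-1} = Y_1 \ldots Y_{i-1}$, and the strict inequality (a) follows since $Y_i$ is related to $Y^{i-1}$ via $X_1^{i-1}$ 
and $X_i$. The last equality above follows from Prop. [U and (28). 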

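For subblock-composition $P$ with $0 < \gamma = \min(P(0), P(1)) < 0.5$, the output entropy on a 
BSC is $H(Y) = h(p_0 \star \gamma)$ and hence 

$$C_{\mathrm{CCC}}(P) = h(p_0 \star \gamma) - h(p_0). \tag{87}$$

For CSCC, from (29) and the definition of $\alpha$, it follows that 

\begin{align}
\frac{1}{L} H(X_1^L) &= H(P) - r(L, P) \tag{88} \\
&= h(\gamma) - r(L, P) = h(\alpha). \tag{89}
\end{align}

Now using (89) and applying Mrs. Gerber's Lemma [47], 

$$\frac{1}{L} H(Y_1^L) \ge h(p_0 \star \alpha), \tag{90}$$

and hence 

\begin{align}
C^L_{\mathrm{CSCC}}(P) &= \frac{1}{L} \left( H(Y_1^L) - \sum_{i=1}^{L} H(Y_i | X_i) \right) \tag{91} \\
&\ge h(p_0 \star \alpha) - h(p_0). \tag{92}
\end{align}

Using (87) and (92) we have 

$$C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) \le h(p_0 \star \gamma) - h(p_0 \star \alpha). \tag{93}$$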
We only have to show that $h(p_0 \star \gamma) - h(p_0 \star \alpha) < r(L, P)$ to complete the proof. Towards 
this we first observe that when $0 < x < 0.5$ and $0 < p_0 < 0.5$, then $p_0 \star x > x$. Next we note 
that the derivative of $h(x)$ satisfies 

$$h'(x) = \log \frac{1 - x}{x}, \tag{94}$$

and hence $h'(x)$ is a monotonically decreasing function of $x$ for $0 < x < 0.5$. 

Since $h(\alpha) = h(\gamma) - r(L, P)$, we have 

$$h(p_0 \star \gamma) - h(p_0 \star \alpha) < r(L, P) \iff h(p_0 \star \gamma) - h(\gamma) < h(p_0 \star \alpha) - h(\alpha). \tag{95}$$

If we define $f(x) = h(p_0 \star x) - h(x)$ for $0 < x < 0.5$, then we have 

$$f'(x) = (1 - 2 p_0)\, h'(p_0 \star x) - h'(x). \tag{96}$$

Hence $f'(x) < 0$ for $0 < x < 0.5$, since $0 < 1 - 2 p_0 < 1$, $h'(x)$ is monotonically decreasing in $x$, and $p_0 \star x > x$. 
This in turn implies that $f(x)$ is a strictly monotonically decreasing function of $x$. It follows 
that $f(\gamma) < f(\alpha)$ (since $\alpha < \gamma$) and (95) is satisfied. ■ 


Appendix D 
Proof of Theorem 5 

For a BEC with erasure probability $\epsilon$, and $\gamma = P(0)$, 

$$C_{\mathrm{CCC}}(P) = (1 - \epsilon)\, h(\gamma). \tag{97}$$

If $\alpha$ is chosen such that $h(\alpha) = h(\gamma) - r(L, P)$, then from (29) it follows that $H(X_1^L)/L = 
h(\alpha)$. Now applying an extension of MGL for binary input symmetric channels [48], we get 
$H(Y_1^L)/L \ge (1 - \epsilon) h(\alpha) + h(\epsilon)$. Thus, 

$$C^L_{\mathrm{CSCC}}(P) = \frac{1}{L} \left( H(Y_1^L) - \sum_{i=1}^{L} H(Y_i | X_i) \right) \ge (1 - \epsilon)\, h(\alpha), \tag{98}$$

and (39) follows from (97), (98), and the definition of $\alpha$. 
Appendix E 
Proof of Theorem 6 

For a Z-channel with $\gamma = \Pr(X = 1)$, $p_0 = \Pr(1 \to 0)$, 

$$C_{\mathrm{CCC}}(P) = h(\gamma(1 - p_0)) - \gamma\, h(p_0). \tag{99}$$

If $0 < \alpha < 0.5$ is chosen such that $h(\alpha) = h(\gamma) - r(L, P)$, then from (29) it follows that 
$H(X_1^L)/L = h(\alpha)$. Now applying the extension of MGL for memoryless asymmetric 
binary-input, binary-output channels [49], we get $H(Y_1^L)/L \ge h(\alpha(1 - p_0))$. Thus, 

$$C^L_{\mathrm{CSCC}}(P) \ge h(\alpha(1 - p_0)) - \gamma\, h(p_0), \tag{100}$$

and (40) follows from (99) and (100). 


Appendix F
Proof of Lemma 2

We first note that the functions $D(V\|W|P)$ and $I(P,V)$ are convex functions of $V$, while the constraint $\sum_{y \in \mathcal{Y}} V(y|x) = 1$, $x \in \mathcal{X}$, is linear in $V$. Thus the problem of minimizing $D(V\|W|P)$ over $V$ subject to $I(P,V) \le R$ and $\sum_{y \in \mathcal{Y}} V(y|x) = 1$ is a convex optimization problem and can be solved by the method of Lagrange multipliers lIMll .

Secondly, note that the functions $D(V\|W|P)$ and $I(P,V)$ depend on $V$ only through those $V(y|x)$ for which $P(x) > 0$. Thus we assume, without loss of generality, that $P(x) > 0$, $\forall x \in \mathcal{X}$. Now consider the Lagrangian $J(V)$ given by

$$J(V) = D(V\|W|P) + \lambda\big(I(P,V) - R\big) + \sum_{x} \mu_x \Big( \sum_{y} V(y|x) - 1 \Big) \tag{101}$$

where $\lambda \ge 0$. On setting the partial derivative of $J(V)$ with respect to $V(y|x)$ equal to zero, we get

$$0 = P(x) \log \frac{V(y|x)}{W(y|x)} + P(x) \log e + \lambda P(x) \log \frac{V(y|x)}{PV(y)} + \mu_x. \tag{102}$$

On substituting $s = \lambda/(1+\lambda)$, the above equation can equivalently be expressed as

$$V(y|x) = W(y|x)^{1-s}\, PV(y)^{s} \exp\!\left( -(1-s)\Big(\log e + \frac{\mu_x}{P(x)}\Big) \right). \tag{103}$$
Since $\sum_{y} V(y|x) = 1$, using (103) we get

$$\exp\!\left( (1-s)\Big(\log e + \frac{\mu_x}{P(x)}\Big) \right) = \sum_{y} W(y|x)^{1-s}\, PV(y)^{s}, \tag{104}$$

and finally using (103), (104), we have

$$V(y|x) = \frac{W(y|x)^{1-s}\, PV(y)^{s}}{\sum_{y} W(y|x)^{1-s}\, PV(y)^{s}}. \tag{105}$$

Note that since $s = \lambda/(1+\lambda)$, the value of $s$ ranges from 0 to 1 as $\lambda$ varies from 0 to $\infty$. By the complementary slackness property It^ , we have

$$\lambda\big(I(P,V) - R\big) = 0. \tag{106}$$

Since $\lambda = 0$ implies $s = 0$ and hence $V = W$ (using (105)), it follows that $I(P,W) = I(P,V) \le R$. This contradicts the assumption in Lemma 2 that $R < I(P,W)$, and hence $\lambda$ is strictly greater than zero. The proof is complete by noting that the conditions $\lambda > 0$ and (106) imply $I(P,V) = R$.
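
Condition (105) defines $V$ only implicitly, because $PV$ on the right-hand side depends on $V$. One way to explore it numerically is a fixed-point iteration, sketched below in Python. This is an assumption made for illustration and not the method of the proof: the function names, the example channel `W`, the input distribution `P`, and the value of `s` are hypothetical, and convergence of the iteration is not claimed by the text.

```python
import math

def output_dist(P, V):
    """Output distribution PV(y) = sum_x P(x) V(y|x)."""
    return [sum(P[x] * V[x][y] for x in range(len(P))) for y in range(len(V[0]))]

def tilted_channel(W, P, s, iters=500):
    """Iterate V <- W^(1-s) (PV)^s / normalizer, starting from V = W."""
    V = [row[:] for row in W]
    for _ in range(iters):
        PV = output_dist(P, V)
        V_next = []
        for x in range(len(P)):
            unnorm = [W[x][y] ** (1 - s) * PV[y] ** s for y in range(len(PV))]
            total = sum(unnorm)
            V_next.append([u / total for u in unnorm])
        V = V_next
    return V

def mutual_information(P, V):
    """I(P, V) in nats."""
    PV = output_dist(P, V)
    return sum(P[x] * V[x][y] * math.log(V[x][y] / PV[y])
               for x in range(len(P)) for y in range(len(PV)) if V[x][y] > 0)

W = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical binary channel
P = [0.3, 0.7]                 # hypothetical input distribution
V = tilted_channel(W, P, s=0.3)
```

For this example the resulting $V$ has smaller mutual information than $W$, in line with the role of $V$ as the minimizer of $D(V\|W|P)$ under the constraint $I(P,V) \le R$.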


Appendix G
Proof of Theorem 7

The $M$ messages to be transmitted are assumed equiprobable. All input sequences of length $n$ with constant subblock-composition $P$ are assigned equal probabilities, and the $i$th message is mapped to a randomly selected input sequence for $1 \le i \le M$. The decoder knows the mapping used by the encoder and uses maximum likelihood (ML) decoding.

The proof uses Fano's approach [45] to upper bound the probability of error by employing tilted distributions, which are summarized next.

Let $A$ be a discrete ensemble consisting of points with probability distribution $P(a)$. If $\phi$ denotes a random variable associated with this ensemble, and $\gamma(s) := \log \sum_{A} \exp\big(s\phi(a)\big) P(a)$, then a family of tilted distributions is given by

$$Q(a) := \exp\big(s\phi(a) - \gamma(s)\big) P(a). \tag{107}$$

Note that for a fixed $s$, the derivative $\gamma'(s)$ (resp. $\gamma''(s)$) denotes the mean (resp. variance) of the random variable $\phi$ with respect to the tilted probability distribution $Q$. For $s = 0$, we have $Q = P$, and thus $\gamma'(0)$ (resp. $\gamma''(0)$) denotes the true mean (resp. variance) of $\phi$.
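
The relation between the derivatives of $\gamma(s)$ and the moments of $\phi$ under the tilted distribution (107) can be illustrated numerically. The Python sketch below compares finite-difference derivatives of $\gamma$ with the mean and variance computed directly from $Q$. The function names, the values of `phi` and `P`, and the step size are assumptions chosen for this example.

```python
import math

def log_mgf(phi, P, s):
    """gamma(s) = log sum_a exp(s phi(a)) P(a)."""
    return math.log(sum(math.exp(s * f) * p for f, p in zip(phi, P)))

def tilted(phi, P, s):
    """Tilted distribution Q(a) of (107)."""
    g = log_mgf(phi, P, s)
    return [math.exp(s * f - g) * p for f, p in zip(phi, P)]

def mean_and_variance(phi, Q):
    """Mean and variance of phi under the distribution Q."""
    mean = sum(f * q for f, q in zip(phi, Q))
    var = sum((f - mean) ** 2 * q for f, q in zip(phi, Q))
    return mean, var

def numeric_derivatives(phi, P, s, eps=1e-4):
    """Central finite differences for gamma'(s) and gamma''(s)."""
    g_minus, g0, g_plus = (log_mgf(phi, P, s + d) for d in (-eps, 0.0, eps))
    return (g_plus - g_minus) / (2 * eps), (g_plus - 2 * g0 + g_minus) / eps ** 2

phi = [0.0, 1.0, 3.0]   # hypothetical values of the random variable
P = [0.5, 0.3, 0.2]     # hypothetical probabilities
```

Calling `mean_and_variance(phi, tilted(phi, P, s))` and `numeric_derivatives(phi, P, s)` for the same `s` should give matching pairs up to finite-difference error.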

Let $n_i$, $i = 1, \ldots, K$, be positive integers and $n = \sum_{i=1}^{K} n_i$. Define the subsets $S_1 = \{1, \ldots, n_1\}$ and $S_k = \{ l \in \mathbb{N} \mid \sum_{i=1}^{k-1} n_i < l \le \sum_{i=1}^{k} n_i \}$ for $2 \le k \le K$. Let $\alpha^n := \alpha_1 \cdots \alpha_n$ denote a sequence of $n$ independent events. For any $i \in S_k$, $k = 1, \ldots, K$, let $\phi_k(\alpha_i)$ be the random variable associated with the event $\alpha_i$ and $P_k(a)$ be the probability $\Pr[\alpha_i = a]$. For $k = 1, \ldots, K$, define

$$\gamma_k(s) := \log \sum_{A} \exp\big(s\phi_k(a)\big) P_k(a) \tag{108}$$

$$\gamma(s) := \sum_{k=1}^{K} \frac{n_k}{n}\, \gamma_k(s) \tag{109}$$

$$Q_k(a) := \exp\big(s\phi_k(a) - \gamma_k(s)\big) P_k(a) \tag{110}$$

Define the sum of random variables,

$$\Phi(\alpha^n) := \sum_{k=1}^{K} \sum_{i \in S_k} \phi_k(\alpha_i) \tag{111}$$

whose tail probability is given by the following lemma.


Lemma 8. ([45, p. 265]) Assume $\gamma(s)$ and its first and second derivatives are finite in the interval $s_1 < s < s_2$ including $s = 0$. If $t$ is a real number with $\gamma'(s_1) < t < \gamma'(s_2)$, then the tail probabilities of $\Phi(\alpha^n)$ satisfy the following inequalities:

$$\Pr[\Phi(\alpha^n) \le nt] \le \exp(-n\beta), \quad \gamma'(s_1) < t \le \gamma'(0) = \bar{\phi} \tag{112}$$

$$\Pr[\Phi(\alpha^n) \ge nt] \le \exp(-n\beta), \quad \gamma'(0) = \bar{\phi} \le t < \gamma'(s_2) \tag{113}$$

where

$$\beta = s\gamma'(s) - \gamma(s) = \sum_{k=1}^{K} \frac{n_k}{n} \sum_{A} Q_k(a) \log \frac{Q_k(a)}{P_k(a)} \ge 0 \tag{114}$$

with $s$ chosen such that $\gamma'(s) = t$.
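
As a small sanity check of Lemma 8, consider the special case $K = 1$ with $\phi(a) = a \in \{0, 1\}$ and $P(1) = p$, so that $\Phi$ is a binomial count. The Python sketch below computes $\beta$ from (114), compares it with the Bernoulli relative entropy, and compares the bound (113) with the exact tail. The function names and the numerical values are assumptions made for this illustration.

```python
import math

def chernoff_exponent(p, t):
    """beta = s*gamma'(s) - gamma(s) for Bernoulli(p), with s solving gamma'(s) = t."""
    s = math.log(t * (1 - p) / (p * (1 - t)))   # closed-form solution of gamma'(s) = t
    gamma = math.log(1 - p + p * math.exp(s))   # gamma(s) for a Bernoulli(p) variable
    return s * t - gamma

def kl_bernoulli(t, p):
    """Relative entropy between Bernoulli(t) and Bernoulli(p), in nats."""
    return t * math.log(t / p) + (1 - t) * math.log((1 - t) / (1 - p))

def upper_tail(n, p, t):
    """Exact Pr[Phi >= n t] for a sum of n i.i.d. Bernoulli(p) variables."""
    k_min = math.ceil(n * t)
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(k_min, n + 1))
```

For instance, with `n = 50`, `p = 0.3`, and `t = 0.5`, the exact tail `upper_tail(n, p, t)` stays below `math.exp(-n * chernoff_exponent(p, t))`.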


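Define the distance between $x \in \mathcal{X}$ and $y \in \mathcal{Y}$ as

$$D(x,y) := \log \frac{f(y)}{W(y|x)} \tag{115}$$

where $f(y)$ is a positive function of $y$ with $\sum_{y} f(y) = 1$. Similarly, the distance between two sequences $\mathbf{u}$ and $\mathbf{v}$ is

$$D(\mathbf{u},\mathbf{v}) = \sum_{x,y} n(x,y)\, D(x,y) = \log \frac{F(\mathbf{v})}{W^n(\mathbf{v}|\mathbf{u})} \tag{116}$$

where $n(x,y)$ is the number of letter pairs $(x,y)$ in $(\mathbf{u},\mathbf{v})$ and, with $n(y)$ denoting the number of occurrences of $y$ in $\mathbf{v}$,

$$F(\mathbf{v}) = \prod_{y} f(y)^{n(y)}, \qquad W^n(\mathbf{v}|\mathbf{u}) = \prod_{x,y} W(y|x)^{n(x,y)}. \tag{117}$$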
If $\mathbf{u}$ is the input sequence and $\mathbf{v}$ is the corresponding output sequence, then an error may occur with ML decoding when one of the other $M - 1$ messages is represented by another input sequence $\mathbf{u}'$ for which $W^n(\mathbf{v}|\mathbf{u}') \ge W^n(\mathbf{v}|\mathbf{u})$, or equivalently $D(\mathbf{u}',\mathbf{v}) \le D(\mathbf{u},\mathbf{v})$. The following lemma gives an upper bound on the probability of error.

Lemma 9. ([45, p. 307]) Let $D_0$ be an arbitrary constant, $\mathbf{u}_0$ be a particular transmitted codeword, $\mathbf{u}$ be another randomly chosen input sequence, and $\mathbf{v}$ be an output sequence. The average probability of error satisfies the inequality $P_e \le MP_1 + P_2$, where:

$$P_1 = \Pr[D(\mathbf{u}_0,\mathbf{v}) \le D_0,\ D(\mathbf{u},\mathbf{v}) \le D(\mathbf{u}_0,\mathbf{v})], \tag{118}$$

$$P_2 = \Pr[D(\mathbf{u}_0,\mathbf{v}) > D_0]. \tag{119}$$

For CCC, $P_e$ is independent of $\mathbf{u}_0$.

Since a CSCC is also a CCC, Lemma 9 will be used to bound the error probability for CSCC, while Lemma 8 will be used to compute $P_2$ in (119).

We now define some terms and notation which will be used later. We say that sequences $\mathbf{v}$ and $\mathbf{v}'$, each comprising $m$ subblocks of length $L$, have the same subblock-composition if the $i$th subblock in sequences $\mathbf{v}$ and $\mathbf{v}'$ has the same composition, for $1 \le i \le m$. The composition of the $i$th subblock of a sequence pair $(\mathbf{u},\mathbf{v})$ is defined as a matrix whose $(j,k)$ entry is equal to $n_i(x_j, y_k)/L$, where $n_i(x_j, y_k)$ is the number of letter pairs $(x_j, y_k)$ in the $i$th subblock of $(\mathbf{u},\mathbf{v})$. The subblock-composition of a sequence pair $(\mathbf{u},\mathbf{v})$ is defined as a length-$m$ vector whose $i$th entry is the composition of the $i$th subblock of $(\mathbf{u},\mathbf{v})$.
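
The definitions above translate directly into a short computation. The following Python sketch is an illustration only: the function names and the representation of a composition as a sorted tuple of (symbol, fraction) pairs are assumptions introduced here.

```python
from collections import Counter

def subblock_composition(seq, L):
    """Return one composition per length-L subblock of seq.

    Each composition is a sorted tuple of (symbol, relative frequency) pairs.
    """
    if len(seq) % L != 0:
        raise ValueError("sequence length must be a multiple of L")
    comps = []
    for start in range(0, len(seq), L):
        counts = Counter(seq[start:start + L])
        comps.append(tuple(sorted((sym, c / L) for sym, c in counts.items())))
    return tuple(comps)

def pair_subblock_composition(u, v, L):
    """Subblock-composition of the sequence pair (u, v), using letter pairs (x, y)."""
    return subblock_composition(list(zip(u, v)), L)
```

For example, with $L = 2$ the sequences `"0110"` and `"1001"` have the same subblock-composition, while `"0011"` and `"0101"` do not, even though all four share the same overall composition.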

The following lemma compares distances between sequences having the same subblock-composition. This lemma will be used to bound $P_1$ in (118).

Lemma 10. Let $\mathbf{u}$ and $\mathbf{v}$ be two particular sequences with elements in $\mathcal{X}$ and $\mathcal{Y}$, respectively. Select equiprobably at random a sequence $\mathbf{u}'$ having the same subblock-composition as $\mathbf{u}$, and a sequence $\mathbf{v}'$ having the same subblock-composition as $\mathbf{v}$. Then

$$\Pr[D(\mathbf{u}',\mathbf{v}) \le D(\mathbf{u},\mathbf{v})] = \Pr[D(\mathbf{u},\mathbf{v}') \le D(\mathbf{u},\mathbf{v})] \tag{120}$$


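Proof: Let $n_i(x)$ (resp. $n_i(y)$) denote the number of occurrences of $x$ (resp. $y$) in the $i$th subblock of $\mathbf{u}$ (resp. $\mathbf{v}$). Let $\mathcal{J}_{\mathbf{u},\mathbf{v}}$ be the set of distinct subblock-compositions for sequence pairs $(\mathbf{u}',\mathbf{v}')$ for which $D(\mathbf{u}',\mathbf{v}') \le D(\mathbf{u},\mathbf{v})$ and $\mathbf{u}'$ (resp. $\mathbf{v}'$) has the same subblock-composition as $\mathbf{u}$ (resp. $\mathbf{v}$).

The number of sequences $\mathbf{u}'$, having the same subblock-composition as $\mathbf{u}$, for which $D(\mathbf{u}',\mathbf{v}) \le D(\mathbf{u},\mathbf{v})$ is

$$\sum_{\mathcal{J}_{\mathbf{u},\mathbf{v}}} \prod_{i=1}^{m} \left( \frac{\prod_{y} n_i(y)!}{\prod_{x,y} n_i(x,y)!} \right). \tag{121}$$

The total number of sequences $\mathbf{u}'$ having the same subblock-composition as $\mathbf{u}$ is

$$\prod_{i=1}^{m} \left( \frac{L!}{\prod_{x} n_i(x)!} \right) = \frac{(L!)^m}{\prod_{i=1}^{m} \prod_{x} n_i(x)!}. \tag{122}$$

Thus, $\Pr[D(\mathbf{u}',\mathbf{v}) \le D(\mathbf{u},\mathbf{v})]$ is equal to the ratio of (121) to (122).

The number of sequences $\mathbf{v}'$, having the same subblock-composition as $\mathbf{v}$, for which $D(\mathbf{u},\mathbf{v}') \le D(\mathbf{u},\mathbf{v})$ is

$$\sum_{\mathcal{J}_{\mathbf{u},\mathbf{v}}} \prod_{i=1}^{m} \left( \frac{\prod_{x} n_i(x)!}{\prod_{x,y} n_i(x,y)!} \right). \tag{123}$$

The total number of sequences $\mathbf{v}'$ having the same subblock-composition as $\mathbf{v}$ is

$$\prod_{i=1}^{m} \left( \frac{L!}{\prod_{y} n_i(y)!} \right) = \frac{(L!)^m}{\prod_{i=1}^{m} \prod_{y} n_i(y)!}. \tag{124}$$

Thus, $\Pr[D(\mathbf{u},\mathbf{v}') \le D(\mathbf{u},\mathbf{v})]$ is the ratio of (123) to (124), which is equal to the ratio of (121) to (122). ■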
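
Lemma 10 can be verified by exhaustive enumeration for short sequences. The Python sketch below uses the distance $D(x,y) = \log\big(f(y)/W(y|x)\big)$ and enumerates all sequences with a given subblock-composition. It is a brute-force illustration under assumed inputs: the function names, the channel `W`, the function `f`, and the sequences `u`, `v` are all hypothetical.

```python
import itertools
import math
from collections import Counter
from fractions import Fraction

def distance(u, v, W, f):
    """D(u, v) = sum_{x,y} n(x,y) log(f(y)/W(y|x)), summed in a fixed key order."""
    counts = Counter(zip(u, v))
    return sum(c * math.log(f[y] / W[x][y]) for (x, y), c in sorted(counts.items()))

def same_subblock_class(seq, L):
    """All distinct sequences with the same subblock-composition as seq."""
    blocks = [tuple(seq[i:i + L]) for i in range(0, len(seq), L)]
    choices = [sorted(set(itertools.permutations(b))) for b in blocks]
    return [sum(combo, ()) for combo in itertools.product(*choices)]

def lemma10_probabilities(u, v, L, W, f):
    """Return the two probabilities compared in (120) as exact fractions."""
    d = distance(u, v, W, f)
    us = same_subblock_class(u, L)
    vs = same_subblock_class(v, L)
    lhs = Fraction(sum(distance(up, v, W, f) <= d for up in us), len(us))
    rhs = Fraction(sum(distance(u, vp, W, f) <= d for vp in vs), len(vs))
    return lhs, rhs

W = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical binary channel W(y|x)
f = [0.4, 0.6]                 # hypothetical positive function with sum 1
u = (0, 1, 1, 0, 0, 1)
v = (0, 1, 0, 0, 1, 1)
lhs, rhs = lemma10_probabilities(u, v, 3, W, f)
```

The distance is computed from the joint letter-pair counts in a fixed order so that sequence pairs with identical joint counts receive identical floating-point values, which keeps the comparison consistent on both sides.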


We now proceed with the main steps leading to the proof of Theorem 7. From Lemma 9 it follows that the probability of error (averaged over mappings from messages to codewords) satisfies $P_e \le MP_1 + P_2$, where $P_2$ can be bounded using Lemma 8 so that $P_2 \le \exp(-n\beta)$, with

$$\beta = s\gamma'(s) - \gamma(s) = D(V\|W|P), \tag{125}$$

$$\gamma(s) = \sum_{x} P(x) \log \sum_{y} \exp\big(sD(x,y)\big) W(y|x), \tag{126}$$

$$V(y|x) = \frac{\exp\big(sD(x,y)\big) W(y|x)}{\sum_{y} \exp\big(sD(x,y)\big) W(y|x)} = \frac{W(y|x)^{1-s} f(y)^{s}}{\sum_{y} W(y|x)^{1-s} f(y)^{s}}; \quad s \ge 0 \tag{127}$$

where $D(x,y)$ is defined in (115) and $s$ is chosen such that

$$\gamma'(s) = \sum_{x,y} P(x) V(y|x) D(x,y) = \frac{D_0}{n}. \tag{128}$$


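Next, we derive an upper bound for $P_1$. If $\mathbf{u}_0$ is the transmitted sequence, $\mathbf{u}'$ is a randomly chosen sequence having the same subblock-composition as $\mathbf{u}_0$, $\mathcal{V}_0$ is the set of sequences $\mathbf{v}$ for which $D(\mathbf{u}_0,\mathbf{v}) \le D_0$, $\mathcal{V}'_v$ is the set of sequences $\mathbf{v}'$ that have the same subblock-composition as $\mathbf{v}$, and $\mathcal{V}'_{0v}$ is the subset of $\mathcal{V}'_v$ for which $D(\mathbf{u}_0,\mathbf{v}') \le D(\mathbf{u}_0,\mathbf{v})$, then:

$$P_1 = \sum_{\mathcal{V}_0} W^n(\mathbf{v}|\mathbf{u}_0) \Pr[D(\mathbf{u}',\mathbf{v}) \le D(\mathbf{u}_0,\mathbf{v})]$$

$$\overset{(a)}{=} \sum_{\mathcal{V}_0} W^n(\mathbf{v}|\mathbf{u}_0) \Pr[D(\mathbf{u}_0,\mathbf{v}') \le D(\mathbf{u}_0,\mathbf{v})]$$

$$= \sum_{\mathcal{V}_0} W^n(\mathbf{v}|\mathbf{u}_0)\, \frac{|\mathcal{V}'_{0v}|}{|\mathcal{V}'_v|}$$

$$\overset{(b)}{=} \sum_{\mathcal{V}_0} \frac{\sum_{\mathcal{V}'_{0v}} W^n(\mathbf{v}|\mathbf{u}_0) F(\mathbf{v}')}{\sum_{\mathcal{V}'_v} F(\mathbf{v}')} \tag{129}$$

where (a) follows from Lemma 10 and (b) follows because $F(\mathbf{v}')$ defined in (117) depends only on the composition of $\mathbf{v}'$ and is the same for all sequences in $\mathcal{V}'_v$.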

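We will first bound the denominator in (129). Let $f(y'|x)$ denote a conditional probability distribution with $f(y') = \sum_{x} P(x) f(y'|x)$. This defines $F(\mathbf{v}'|\mathbf{u})$, obtained as the product of the values of $f(y'|x)$ for each corresponding pair of events of the sequences $\mathbf{v}'$ and $\mathbf{u}$. If $\mathbf{u}$ is any sequence consisting of letters $x \in \mathcal{X}$, then we define $P(\mathbf{u}) = \prod_{x} P(x)^{n(x)}$, where $n(x)$ is the number of occurrences of letter $x$ in $\mathbf{u}$. If $\mathcal{U}$ is the space of all possible length-$n$ sequences consisting of letters $x \in \mathcal{X}$, and $\mathcal{U}_0$ denotes the subset of sequences having the given subblock-composition, then

$$F(\mathbf{v}') = \sum_{\mathcal{U}} P(\mathbf{u}) F(\mathbf{v}'|\mathbf{u}) \ge \sum_{\mathcal{U}_0} P(\mathbf{u}) F(\mathbf{v}'|\mathbf{u}). \tag{130}$$

It follows that

$$\sum_{\mathcal{V}'_v} F(\mathbf{v}') \ge \sum_{\mathcal{U}_0} P(\mathbf{u}) \sum_{\mathcal{V}'_v} F(\mathbf{v}'|\mathbf{u}) \overset{(c)}{=} \Big( \sum_{\mathcal{V}'_v} F(\mathbf{v}'|\mathbf{u}) \Big) \sum_{\mathcal{U}_0} P(\mathbf{u}) \tag{131}$$

where (c) follows because $\sum_{\mathcal{V}'_v} F(\mathbf{v}'|\mathbf{u})$ is the same for all $\mathbf{u} \in \mathcal{U}_0$. Further,

$$\sum_{\mathcal{U}_0} P(\mathbf{u}) = \left( \frac{L!}{\prod_{x} (LP(x))!} \prod_{x} P(x)^{LP(x)} \right)^{n/L} \tag{132}$$

$$= \exp\big(-n\, r(L,P)\big) \tag{133}$$

with $r(L,P)$ given by (30). Thus, we have

$$\sum_{\mathcal{V}'_v} F(\mathbf{v}') \ge \exp\big(-n\, r(L,P)\big) \sum_{\mathcal{V}'_v} F(\mathbf{v}'|\mathbf{u}). \tag{134}$$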
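
The identity in (132) and (133) gives a direct way to evaluate the rate penalty $r(L,P)$ from the symbol counts of a subblock. The Python sketch below assumes exactly that identity, works in nats, and uses a hypothetical function name; the defining equation (30) itself is not reproduced in this appendix.

```python
import math

def rate_penalty(counts):
    """r(L, P) in nats, where counts[x] = L * P(x) and L = sum(counts).

    Uses exp(-L r) = (L! / prod_x counts[x]!) * prod_x P(x)^counts[x].
    """
    L = sum(counts)
    # log of the multinomial coefficient L! / prod_x counts[x]!
    log_multinomial = math.lgamma(L + 1) - sum(math.lgamma(c + 1) for c in counts)
    # log of prod_x P(x)^(L P(x)); symbols with zero count contribute nothing
    log_prob = sum(c * math.log(c / L) for c in counts if c > 0)
    return -(log_multinomial + log_prob) / L
```

For a binary subblock with one 0 and one 1, `rate_penalty([1, 1])` equals $(\log 2)/2$, and the penalty shrinks as the subblock length grows with the composition held fixed, for example `rate_penalty([50, 50])`.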

We will now bound the numerator in (129). For each $x \in \mathcal{X}$, define the logarithm of the moment-generating function

$$\gamma_x(w_1, w_2) := \log \sum_{y \in \mathcal{Y}} \sum_{y' \in \mathcal{Y}} \exp\big( w_1 D(x,y) + w_2 [D(x,y') - D(x,y)] \big) f(y') W(y|x) \tag{135}$$

where $w_1$ and $w_2$ are parameters associated with the random variables $D(x,y)$ and $D(x,y') - D(x,y)$, respectively. Define the tilted probability distributions

$$Q_0(y,y'|x) := \exp\big( w_1 D(x,y) + w_2 [D(x,y') - D(x,y)] - \gamma_x(w_1,w_2) \big) f(y') W(y|x) \tag{136}$$

$$Q_0(\mathbf{v},\mathbf{v}'|\mathbf{u}_0) := \exp\big( w_1 D(\mathbf{u}_0,\mathbf{v}) + w_2 [D(\mathbf{u}_0,\mathbf{v}') - D(\mathbf{u}_0,\mathbf{v})] - n\gamma_0(w_1,w_2) \big) F(\mathbf{v}') W^n(\mathbf{v}|\mathbf{u}_0) \tag{137}$$

where $\gamma_0(w_1,w_2) = \sum_{x} P(x) \gamma_x(w_1,w_2)$. From (137), it follows that

$$\sum_{\mathcal{V}'_{0v}} W^n(\mathbf{v}|\mathbf{u}_0) F(\mathbf{v}') \le \exp\big( n\gamma_0(w_1,w_2) - w_1 D(\mathbf{u}_0,\mathbf{v}) \big) \times \sum_{\mathcal{V}'_{0v}} Q_0(\mathbf{v},\mathbf{v}'|\mathbf{u}_0); \quad w_2 \le 0 \tag{138}$$

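The RHS of (138) is minimized by setting $w_1 = 2w_2 + 1$, in which case we have

$$Q_0(y,y'|x) = Q_0(y|x)\, Q_0(y'|x) \tag{139}$$

$$Q_0(y|x) = \frac{W(y|x)^{(1-w_1)/2} f(y)^{(1+w_1)/2}}{\sum_{y} W(y|x)^{(1-w_1)/2} f(y)^{(1+w_1)/2}} \tag{140}$$

$$Q_0(\mathbf{v},\mathbf{v}'|\mathbf{u}_0) = Q_0(\mathbf{v}|\mathbf{u}_0)\, Q_0(\mathbf{v}'|\mathbf{u}_0) \tag{141}$$

Using (138), (141), and the fact that $\mathcal{V}'_{0v} \subseteq \mathcal{V}'_v$, we have

$$\sum_{\mathcal{V}'_{0v}} W^n(\mathbf{v}|\mathbf{u}_0) F(\mathbf{v}') \le \exp\big( n\gamma_0(w_1,w_2) - w_1 D(\mathbf{u}_0,\mathbf{v}) \big) \times Q_0(\mathbf{v}|\mathbf{u}_0) \sum_{\mathcal{V}'_v} Q_0(\mathbf{v}'|\mathbf{u}_0); \quad w_1 \le 1 \tag{142}$$
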


If we let $f(y'|x) = Q_0(y'|x)$, then $F(\mathbf{v}'|\mathbf{u}_0) = Q_0(\mathbf{v}'|\mathbf{u}_0)$, and $P_1$ can be bounded using (129), (134), and (142) as

$$P_1 \le \exp\big( n\, r(L,P) + n\gamma_0(w_1,w_2) \big) \times \sum_{\mathcal{V}_0} \exp\big( -w_1 D(\mathbf{u}_0,\mathbf{v}) \big) Q_0(\mathbf{v}|\mathbf{u}_0); \quad w_1 \le 1$$

Since all the sequences $\mathbf{v}$ belonging to $\mathcal{V}_0$ have a distance from $\mathbf{u}_0$ which does not exceed $D_0$, we have

$$P_1 \le \exp\big( n\, r(L,P) + n\gamma_0(w_1,w_2) - w_1 D_0 \big); \quad w_1 \le 0 \tag{143}$$

From (128), we know that $D_0 = n\gamma'(s)$, and the RHS of (143) is minimized when $w_1 = 2s - 1$, $w_2 = s - 1$, and $s$ satisfies $0 \le s \le 1/2$. In this case, we have

$$Q_0(y|x) = V(y|x), \tag{144}$$

$$\gamma_0(w_1,w_2) = 2\gamma(s), \tag{145}$$

where $V(y|x)$ is given by (127). Now (143) takes the form

$$P_1 \le \exp\big( -n[(2s-1)\gamma'(s) - 2\gamma(s) - r(L,P)] \big); \quad 0 \le s \le 1/2 \tag{146}$$
Note that since $f(y|x) = Q_0(y|x)$, it follows from (144) that

$$f(y) = PV(y). \tag{147}$$

Since $R = \frac{1}{n} \log M$ and $P_e \le MP_1 + P_2$, we have

$$P_e \le \exp\big( -n[(2s-1)\gamma'(s) - 2\gamma(s) - (R + r(L,P))] \big) + \exp\big( -n[s\gamma'(s) - \gamma(s)] \big); \quad 0 \le s \le 1/2. \tag{148}$$

Now, if we choose $s$ such that

$$(s-1)\gamma'(s) - \gamma(s) = R + r(L,P); \quad 0 \le s \le 1/2, \tag{149}$$

then it follows from (148), (149), and (125) that

$$P_e \le 2\exp\big( -nD(V\|W|P) \big). \tag{150}$$

From (127) and (147) we have

$$V(y|x) = \frac{W(y|x)^{1-s}\, PV(y)^{s}}{\sum_{y} W(y|x)^{1-s}\, PV(y)^{s}}. \tag{151}$$


Now $(s-1)\gamma'(s) - \gamma(s)$ is a decreasing function of $s$ for $0 \le s \le 1/2$ (its derivative is $(s-1)\gamma''(s)$). Let $\hat{R}$ denote its value at $s = 1/2$, that is, $\hat{R} = -0.5\gamma'(0.5) - \gamma(0.5)$. From (125), (128), and (147), it follows that condition (149) can equivalently be expressed as

$$I(P,V) = R + r(L,P), \quad \text{if } R + r(L,P) \ge \hat{R}. \tag{152}$$

If we let

$$R' = R + r(L,P), \tag{153}$$

then using Lemma 2, (151), and (152), we observe that $D(V\|W|P) = E_{sp}(R',P,W)$, and (150) is equivalent to

$$P_e \le 2\exp\big( -nE_{sp}(R',P,W) \big); \quad \text{if } R' \ge \hat{R}. \tag{154}$$

Note that for $0 \le s \le 1/2$, we have $E_{sp}(R',P,W) = s\gamma'(s) - \gamma(s)$ and $R' = (s-1)\gamma'(s) - \gamma(s)$. Thus

$$\frac{\partial E_{sp}(R',P,W)}{\partial R'} = -\frac{s}{1-s}; \quad 0 \le s \le 1/2 \tag{155}$$

and the slope of $E_{sp}(R',P,W)$ with respect to $R'$ at $s = 1/2$ (corresponding to $R' = \hat{R}$) is $-1$.

For obtaining a bound on $P_e$ when $R' < \hat{R}$, we let the radius $D_0 = \infty$. In this case,

$$P_1 = \sum_{\mathcal{V}} W^n(\mathbf{v}|\mathbf{u}_0) \Pr[D(\mathbf{u}',\mathbf{v}) \le D(\mathbf{u}_0,\mathbf{v})] \tag{156}$$

where $\mathcal{V}$ is the space of all possible output sequences. Note that $P_2 = 0$ when $D_0 = \infty$. The upper bound for $P_1$ in (156) can be calculated the same way as before. However, the condition $D(\mathbf{u}_0,\mathbf{v}) \le D_0$ is eliminated by setting $w_1 = 0$ in (135). Here, the upper bound for $P_1$ is obtained by setting $w_2 = s - 1$ and $s = 1/2$. The probability of error in this case is bounded as

$$P_e \le \exp\big( -n(-2\gamma(0.5) - R') \big) = \exp\big( -n(E_{sp}(\hat{R},P,W) + \hat{R} - R') \big); \quad \text{if } R' < \hat{R} \tag{157}$$

and the proof is complete by combining (153), (154), and (157).

Appendix H
Proof of Theorem 8

For the proof, we will construct a simple example of a symmetric DMC for which the uniform distribution over $\mathcal{A}$ does not achieve SECC capacity.

Fig. 9. $I(X_1^L = x_1^L; Y_1^L)$ versus BSC crossover probability $p_0$ for $L = 2$.

Consider the following parameters for a BSC with crossover probability $p_0$:

$$b(0) = 0, \quad b(1) = 1, \quad B = 0.5, \quad L = 2, \quad 0 < p_0 < 0.5. \tag{158}$$
With the above parameters, the input alphabet for the induced vector channel is given by $\mathcal{A} = \{01, 10, 11\}$. A uniform distribution over $\mathcal{A}$ will achieve SECC capacity if and only if $I(X_1^L = x_1^L; Y_1^L)$ is the same for all $x_1^L \in \mathcal{A}$ [l^ Thm. 4.5.1], where

$$I(X_1^L = x_1^L; Y_1^L) = \sum_{y_1^L} W^L(y_1^L | x_1^L) \log \frac{W^L(y_1^L | x_1^L)}{\Pr(Y_1^L = y_1^L)} \tag{159}$$

and $\Pr(Y_1^L = y_1^L)$ is the output probability induced by the uniform input distribution over $\mathcal{A}$. The proof is completed by numerically verifying that, for a BSC with the parameters given by (158), $I(X_1^L = 01; Y_1^L) \ne I(X_1^L = 11; Y_1^L)$. Fig. 9 shows that $I(X_1^L = 01; Y_1^L)$ and $I(X_1^L = 11; Y_1^L)$ are different when $0 < p_0 < 0.5$. ■
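
The numerical verification mentioned above can be reproduced with a few lines of Python. The sketch below evaluates the per-input information under a uniform distribution on $\{01, 10, 11\}$ for a BSC with $L = 2$. The function names and the choice $p_0 = 0.1$ are assumptions made for this example, and information is measured in nats.

```python
import math
from itertools import product

def bsc_block(p0, x, y):
    """Probability that a BSC with crossover p0 maps input block x to output block y."""
    return math.prod(p0 if a != b else 1 - p0 for a, b in zip(x, y))

def per_input_information(p0, inputs):
    """I(X = x; Y) in nats for each x, under a uniform distribution on inputs."""
    L = len(inputs[0])
    outputs = list(product((0, 1), repeat=L))
    # Output distribution induced by the uniform input distribution
    py = {y: sum(bsc_block(p0, x, y) for x in inputs) / len(inputs) for y in outputs}
    return {
        x: sum(bsc_block(p0, x, y) * math.log(bsc_block(p0, x, y) / py[y])
               for y in outputs)
        for x in inputs
    }

info = per_input_information(0.1, [(0, 1), (1, 0), (1, 1)])
```

By symmetry the values for inputs 01 and 10 coincide, while the value for 11 differs from them, which is the inequality the proof relies on.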